The real bottleneck
A simple aggregation of notes I wrote in Obsidian. No new knowledge can be extracted from this article.
Since the publication of the scaling laws, which demonstrated that model performance improves as a predictable power-law function of model size, dataset size, and total training compute, the dominant constraint on progress appeared to be compute itself (the sheer number and quality of GPUs that could be brought to bear on a single training run). These laws turned what had once felt like alchemy into something closer to engineering: feed the system more of the right inputs in the right proportions, and performance reliably improves.
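As a rough illustration of what "predictable power-law function" means, here is a minimal sketch that assumes loss falls as (N_c / N)^alpha with model size N. The function name and the constants `n_c` and `alpha` are illustrative placeholders, not values taken from this article or fitted to any real model.

```python
def power_law_loss(n_params: float, n_c: float = 1e14, alpha: float = 0.08) -> float:
    """Hypothetical loss curve: L(N) = (n_c / N) ** alpha.

    n_c and alpha are placeholder constants chosen only for illustration.
    """
    return (n_c / n_params) ** alpha


def loss_ratio_per_doubling(alpha: float = 0.08) -> float:
    """Factor by which the loss is multiplied each time N doubles."""
    return 2.0 ** (-alpha)


if __name__ == "__main__":
    for n in (1e9, 2e9, 4e9, 8e9):
        print(f"N = {n:.0e}  loss = {power_law_loss(n):.4f}")
    print(f"loss multiplier per doubling: {loss_ratio_per_doubling():.4f}")
```

Under this assumed form, every doubling of model size multiplies the loss by the same constant factor, which is what makes the improvement predictable and also why each further gain costs so much more compute.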
For years, the story of AI advancement has largely been the story of hardware progress and the extraordinary ability of labs to assemble the capital and infrastructure needed for larger training clusters. Most observers assumed this pattern would continue. And to a striking degree, it has. Yet beneath the surface of this compute-driven narrative, another constraint has been quietly growing in importance: energy.
Countries that can rapidly expand reliable power generation and transmission are finding themselves better positioned as models scale. Some nations, China in particular, have demonstrated the capacity to build large energy infrastructure projects on accelerated timelines, completing nuclear reactors in five to seven years and adding solar capacity at very large scale. This execution capability is becoming increasingly relevant to artificial intelligence development.
Elon Musk captured the coming shift when he noted that the limiting factor will move from chips to energy on Earth. A year earlier he had already warned that the constraint would progress "from chips to voltage transformers to electricity generation." Both observations point to the same underlying reality: even if scaling laws continue to hold, the next hard ceiling is not silicon fabrication but the ability to deliver reliable gigawatts where and when they are needed.
Major AI labs are already treating energy availability as a central variable. xAI's expansions of the Colossus cluster, OpenAI's Stargate project, and Google's ongoing TPU buildouts are structured around securing multi-gigawatt supplies.
Current frontier training runs already draw well over 100 megawatts. Detailed modeling by Epoch AI in partnership with the Electric Power Research Institute shows that power demand for the largest individual training runs has been growing at roughly 2.2 times per year. Continuing that trajectory points to single frontier training runs requiring on the order of 4 gigawatts by 2030.
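The extrapolation above can be checked with a few lines of arithmetic. This is only a sketch: it takes 100 megawatts as the baseline, which is a floor (the text says "well over 100 megawatts"), and it assumes the 2.2 times per year growth rate holds constant. The function names are made up for illustration.

```python
import math


def project_power_mw(baseline_mw: float, growth_per_year: float, years: float) -> float:
    """Compound a baseline power draw forward at a constant yearly growth factor."""
    return baseline_mw * growth_per_year ** years


def years_to_reach(baseline_mw: float, target_mw: float, growth_per_year: float) -> float:
    """Years of constant growth needed to go from baseline_mw to target_mw."""
    return math.log(target_mw / baseline_mw) / math.log(growth_per_year)


if __name__ == "__main__":
    baseline_mw = 100.0   # assumed floor for a current frontier run
    growth = 2.2          # growth factor per year quoted in the text
    target_mw = 4000.0    # 4 gigawatts

    for years in range(1, 6):
        mw = project_power_mw(baseline_mw, growth, years)
        print(f"after {years} year(s): {mw:,.0f} MW")
    print(f"years to reach 4 GW: {years_to_reach(baseline_mw, target_mw, growth):.2f}")
```

With these assumed inputs, a 100 megawatt run reaches 4 gigawatts in a little under five years, which is consistent with a 2030 horizon. A higher starting point would reach that level sooner.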
The International Energy Agency projects that global data-center electricity consumption will more than double from around 415 terawatt-hours in 2024 to roughly 945 terawatt-hours by 2030.
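To see how fast that projection is in yearly terms, the two endpoints can be turned into an implied compound annual growth rate. This is a minimal sketch; the function name is hypothetical, and it assumes smooth compounding over the six years from 2024 to 2030, which the projection itself does not claim.

```python
def implied_annual_growth(start: float, end: float, years: float) -> float:
    """Constant yearly growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    start_twh = 415.0   # 2024 figure quoted in the text
    end_twh = 945.0     # 2030 figure quoted in the text
    rate = implied_annual_growth(start_twh, end_twh, 2030 - 2024)
    print(f"implied growth: {rate * 100:.1f}% per year")
```

Under that assumption the aggregate figure works out to a little under 15 percent per year. That is far slower than the 2.2 times per year quoted for the largest individual training runs, which is why the pressure shows up first at specific sites and only later in national totals.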
To put these numbers in perspective, today's AI data centers still represent only a modest share of total electricity generation in large economies. What changes the equation is the concentration and speed of demand. A sustained multi-gigawatt load cannot simply be plugged into an average section of the grid. Power must either be generated on site or drawn through transmission infrastructure capable of handling that load continuously and reliably.
This reality creates meaningful differences between regions.
In Europe, for example, many transmission networks are older and were not designed for the sudden, high-density loads created by large AI clusters. Regulatory processes for new infrastructure tend to be lengthy. In many European countries, electricity prices are also significantly higher than in the United States or China. These factors make it considerably more difficult to assemble the consistent, high-capacity power needed for frontier-scale training. A distributed approach across multiple countries could theoretically help, but coordinating such systems across aging grids and differing regulatory environments adds substantial complexity and risk.
Electricity is not uniformly available across regions, and transmission bottlenecks are already visible in practice. Real incidents of localized overloads causing data centers to trip offline serve as early warnings of what could become more frequent as demand grows.
Political and social constraints add another layer. Local opposition can delay projects over concerns about noise, visual impact, water use, or effects on electricity prices for existing customers. Suitable sites must simultaneously offer reliable high-capacity power at competitive prices, available land, and access to skilled workers. These requirements naturally concentrate development in a limited number of locations, further intensifying pressure on specific regional grids.
Yet the situation is not one of inevitable limitation for every Western country. The United States retains significant advantages in capital markets, technological agility, and the ability to attract massive private investment. Where substantial resources and political will align, practical solutions tend to emerge. Hardware efficiency continues to advance. Distributed training across multiple sites can spread peak loads geographically. The American power system possesses more flexibility than the most pessimistic forecasts suggest, through accelerated natural gas deployment, rapid solar buildouts, and the economic reality that companies spending enormous sums on chips are willing to pay premiums for reliable electricity. This doesn’t mean there won’t be issues, though.
The United States still has the capital, talent, and innovative capacity to close the gap, but only if it treats energy infrastructure with the same urgency that China has applied to its own buildout over the past decade.
Sources
Epoch AI & EPRI, Scaling Intelligence: The Exponential Growth of AI’s Power Needs (2025)
International Energy Agency, Energy and AI (2025)
Epoch AI, “Can AI scaling continue through 2030?” and “Is almost everyone wrong about America’s AI power problem?”
Belfer Center, AI, Data Centers, and the U.S. Electric Grid





Yes, let’s invest in energy stocks.