Compute (AI Capabilities): Research Report
Executive Summary
| Finding | Key Data | Implication |
|---|---|---|
| Training costs escalating | 2.4x/year growth; GPT-4 cost $78M, Gemini Ultra $191M | Billion-dollar training runs by 2027; only well-funded orgs can compete |
| Supply chain concentration | TSMC: 92% of advanced chips; ASML: 100% of EUV machines | Single points of failure create governance opportunities and geopolitical risk |
| Hardware bottlenecks easing | H100 lead times dropped from 11 months (2023) to 8-12 weeks (2024) | Memory (HBM) now the binding constraint through 2027 |
| Compute governance emerging | US: 10²⁶ FLOPs; EU: 10²⁵ FLOPs reporting thresholds | Compute is measurable, concentrated, physical—ideal governance lever |
| Energy demand doubling | Data centers: 415 TWh (2024) → 945 TWh (2030); AI: 35-50% of DC load | Infrastructure growth outpacing electricity supply; nuclear partnerships emerging |
Research Summary
Compute has emerged as the most governable input to AI development because it is measurable, concentrated, and physical. Training costs for frontier AI models have grown at 2.4× per year since 2016, with GPT-4 costing approximately $78 million and Gemini Ultra around $191 million—projecting to billion-dollar training runs by 2027. The supply chain exhibits extreme concentration: TSMC manufactures 92% of advanced chips, ASML holds a monopoly on EUV lithography equipment, and NVIDIA controls roughly 80% of the AI accelerator market.
These chokepoints create natural governance leverage. The US, EU, and California have all implemented compute-based regulatory thresholds (10²⁵-10²⁶ FLOPs) that trigger reporting requirements. Hardware bottlenecks have shifted from GPU availability to high-bandwidth memory (HBM), with lead times dropping from 11 months to 8-12 weeks. However, energy constraints are intensifying: AI-driven data center demand is projected to consume 945 TWh by 2030, prompting major tech companies to pursue nuclear partnerships. The concentration of advanced chip manufacturing in Taiwan (TSMC) and lithography equipment in the Netherlands (ASML) creates significant geopolitical risk, driving US efforts to reshore semiconductor production through the CHIPS Act.
Background
Compute—the hardware resources required to train and run AI systems—has emerged as perhaps the most tractable lever for AI governance. Unlike algorithms (which can be shared instantly) or data (which is hard to track), compute is measurable (FLOPs, GPU-hours), concentrated (few chokepoints like TSMC, ASML, NVIDIA), and physical (can be tracked, controlled, and metered).
Training frontier models now costs tens to hundreds of millions of dollars in compute alone. Anthropic CEO Dario Amodei has stated that frontier AI developers are likely to spend close to a billion dollars on a single training run in 2025, with up to ten billion-dollar training runs expected in the next two years. This concentration of resources creates natural governance chokepoints.
Key Findings
Training Compute Costs: Exponential Growth
The cost trajectory for training frontier AI models has followed a remarkably consistent exponential trend:
| Model | Year | Training Compute Cost | Notes |
|---|---|---|---|
| GPT-3 | 2020 | $4-12M | Established LLM scaling paradigm |
| GPT-4 | 2023 | $78M | Per Stanford AI Index 2024 |
| Gemini Ultra | 2024 | $191M | Google’s flagship model |
| Projected (2027) | 2027 | >$1B | If 2.4x/year growth continues |
Epoch AI’s analysis found that the amortized hardware and energy cost for the final training run of frontier models has grown at a rate of 2.4x per year since 2016 (95% CI: 2.0x to 3.1x). This rate significantly exceeds Moore’s Law and suggests that “given that total model development costs at the frontier are already over $100 million, these advances may only be accessible to the largest companies and government institutions.”
OpenAI’s 2024 compute spending illustrates the scale: $3 billion on training compute, $1.8 billion on inference compute, and $1 billion on research compute amortized over multiple years (Epoch AI, 2024).
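As a sanity check on the >$1B projection in the table above, compounding the 2.4x/year growth rate from either cost anchor is a two-line calculation. A minimal sketch (only the 2023/2024 anchors come from the Epoch AI figures cited above; the rest is arithmetic):

```python
# Project frontier training-run costs forward at Epoch AI's estimated
# 2.4x/year growth rate, anchored on the GPT-4 and Gemini Ultra figures.
GROWTH_PER_YEAR = 2.4  # central estimate (95% CI: 2.0x-3.1x)

anchors = {2023: 78e6, 2024: 191e6}  # GPT-4, Gemini Ultra (USD)

for base_year, base_cost in anchors.items():
    for year in range(base_year + 1, 2028):
        cost = base_cost * GROWTH_PER_YEAR ** (year - base_year)
        print(f"{base_year} anchor -> {year}: ${cost / 1e9:.2f}B")

# Both anchors cross $1B during 2026, making the ">$1B by 2027"
# projection in the table conservative under the central growth rate.
```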
Scaling Laws: From Training to Inference
While traditional scaling laws (Kaplan et al., 2020; Hoffmann et al., 2022/Chinchilla) focused on training compute, recent research has expanded to inference scaling laws:
| Scaling Dimension | Key Finding | Implication |
|---|---|---|
| Training | Performance ∝ compute^α (α ≈ 0.5-0.7) | Predictable capability growth with compute |
| Inference | Test-time compute can be more efficient than parameter scaling | Smaller models + advanced inference may be Pareto-optimal |
| Architecture | MLP-to-attention ratio and grouped-query attention (GQA) affect inference cost | Conditional scaling laws needed for deployment |
| Efficiency | “Densing law”: capability density doubles every 3.5 months | Same capability with exponentially fewer parameters over time |
The “densing law” published in Nature Machine Intelligence states that capability density (capability per parameter) doubles approximately every 3.5 months, indicating that equivalent model performance can be achieved with exponentially fewer parameters over time. This has significant implications for compute efficiency and deployment costs.
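To make the densing law’s compounding concrete, a short sketch (the 3.5-month doubling period is from the Nature Machine Intelligence paper cited above; the horizons chosen are illustrative):

```python
# "Densing law": capability density (capability per parameter) doubles
# roughly every 3.5 months, so the parameter count needed to match a
# fixed capability level halves on the same schedule.
DOUBLING_MONTHS = 3.5

def density_gain(months: float) -> float:
    """Cumulative capability-density multiplier after `months`."""
    return 2 ** (months / DOUBLING_MONTHS)

for months in (3.5, 12, 24):
    gain = density_gain(months)
    print(f"{months:>4} months: {gain:6.1f}x density -> same capability "
          f"with ~{100 / gain:.1f}% of the parameters")
# 12 months: ~10.8x (roughly 9% of the parameters); 24 months: ~116x.
```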
Hardware Supply Chain: Critical Chokepoints
The AI compute supply chain exhibits extreme concentration at multiple levels:
TSMC: The Chip Manufacturing Bottleneck
| Metric | Value | Source |
|---|---|---|
| Market share (advanced chips) | 92% | AILAB Blog (2025) |
| Geographic concentration | Single island (Taiwan), 13,976 sq mi | Global Taiwan Institute |
| Economic impact of disruption | $10 trillion (10% of global GDP) | Verdantix (2025) |
| Key customers | Apple, NVIDIA, Qualcomm, Samsung, AMD | Industry analysis |
TSMC’s CoWoS (Chip-on-Wafer-on-Substrate) packaging technology has emerged as the specific bottleneck for AI chips: the CoWoS process cannot scale fast enough to meet demand. Additionally, the TRX5090 substrate—a critical component binding the GPU core to its high-bandwidth memory—is in extremely limited supply, with only a handful of manufacturers in Japan and Taiwan able to produce it at the required precision and volume.
TSMC is diversifying with plans for six semiconductor fabs in Arizona, plus expansion in Japan and Germany. However, diversification is not feasible in the short term due to high reshoring costs and talent acquisition challenges (NPR, 2025).
ASML: The EUV Lithography Monopoly
| Metric | Value | Source |
|---|---|---|
| EUV market share | 100% | CNBC (2022) |
| Machine cost | $150-400M (High-NA: $370M+) | Industry reports |
| R&D investment | $9B over 30 years | xLight analysis |
| Parts per machine | >100,000 | Technical specifications |
| Suppliers | 800+ globally | Supply chain analysis |
ASML builds 100% of the world’s extreme ultraviolet (EUV) lithography machines, without which cutting-edge chips cannot be made. The Dutch company won a 30-year development race that left it the sole supplier of the tool essential for fabricating leading-edge semiconductors (Strange VC, 2025).
Potential competition: xLight, a startup founded in 2024 with ousted Intel CEO Pat Gelsinger as executive chairman, is developing free-electron lasers driven by compact particle accelerators as an alternative to ASML’s laser-produced plasma approach. The Trump administration injected up to $150 million into xLight from the 2022 CHIPS and Science Act, though this remains early-stage R&D with an uncertain timeline (24/7 Wall St., 2025).
NVIDIA and Memory Bottlenecks
GPU shortages dominated 2022-2023, but the situation has evolved:
| Period | H100 Lead Time | Binding Constraint |
|---|---|---|
| 2023 (peak shortage) | 11 months | GPU chip supply |
| Early 2024 | 3-4 months | CoWoS packaging |
| Late 2024 | 8-12 weeks | High-bandwidth memory (HBM) |
| 2025-2027 | Variable | Memory supply (SK Hynix, Samsung, Micron) |
NVIDIA’s market dominance is reflected in financials: Q3 fiscal 2026 total revenue reached $57.0 billion, with data center operations accounting for $51.2 billion—90% of the company’s entire business. An NVIDIA H100 AI accelerator sells for $25,000-40,000, giving the company unusual pricing power during shortage periods (BattleforgePC, 2025).
Compute Governance: Regulatory Frameworks
Compute thresholds have emerged as a central mechanism for AI governance across multiple jurisdictions:
| Jurisdiction | Threshold | Reporting Requirements | Status |
|---|---|---|---|
| US (EO 14110) | 10²⁶ FLOPs (10²³ for bio) | Notify government, report security measures, share red-team results | Revoked by EO 14148; rules proposed |
| EU AI Act | 10²⁵ FLOPs | Notify Commission, perform evaluations, assess systemic risks, report incidents | Active; affects 5-15 companies |
| California (SB 53) | 10²⁶ FLOPs + $500M revenue | Disclose “frontier AI framework,” report catastrophic risk assessments quarterly | Enacted 2025 |
| New York (RAISE Act) | Revenue-based (removed compute threshold) | Report critical incidents within 72 hours | Signed into law 2025 |
Rationale: Training compute can serve as a proxy for the capabilities of AI models. A compute threshold operates as a regulatory trigger, identifying which models might possess more powerful and dangerous capabilities that warrant greater scrutiny (GovAI, 2024).
Limitations: The debate has centered mostly around a single training compute threshold, but governments could adopt a pluralistic and risk-adjusted approach by introducing multiple compute thresholds that trigger different measures according to degree or nature of risk. Some proposals recommend a tiered approach that would create fewer obligations for models trained on less compute (Institute for Law & AI, 2024).
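To make the trigger mechanics concrete, here is a minimal back-of-the-envelope sketch of how a developer or regulator might estimate whether a training run crosses a reporting threshold. All cluster parameters below (chip count, per-chip throughput, utilization, duration) are hypothetical illustration values, not figures from any cited source:

```python
# Rough training-compute estimate: chips x per-chip peak FLOP/s x
# average utilization x wall-clock seconds, compared against the
# US and EU reporting thresholds discussed above.
US_THRESHOLD = 1e26  # FLOPs (former EO 14110 trigger)
EU_THRESHOLD = 1e25  # FLOPs (EU AI Act trigger)

def training_flops(num_chips: int, peak_flops: float,
                   utilization: float, days: float) -> float:
    """Total FLOPs performed over the run (hypothetical inputs)."""
    return num_chips * peak_flops * utilization * days * 86_400

# Hypothetical frontier run: 25,000 accelerators at ~1e15 FLOP/s peak,
# 40% sustained utilization, 90 days of wall-clock training.
est = training_flops(25_000, 1e15, 0.40, 90)
print(f"estimated training compute: {est:.2e} FLOPs")        # ~7.8e25
print(f"crosses EU 1e25 threshold:  {est >= EU_THRESHOLD}")  # True
print(f"crosses US 1e26 threshold:  {est >= US_THRESHOLD}")  # False
```

Under these illustrative numbers the run would trigger the EU notification requirement while falling just short of the former US threshold, which shows how sensitive a single fixed trigger is to cluster size, utilization, and run length.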
Export Controls and Geopolitics
The US has implemented increasingly strict export controls on advanced computing chips to China, with significant implications:
| Policy Action | Target | Impact | Assessment |
|---|---|---|---|
| October 2022 | Advanced chips (A100, H100) to China | Created incentive for compute-efficient algorithms | Accelerated Chinese innovation |
| October 2023 | Expanded chip restrictions | Cloud computing loopholes remain | Limited effectiveness |
| December 2024 | High-bandwidth memory (HBM) restrictions | Targets deployment compute for reasoning models | Addresses inference scaling |
| AI Diffusion Framework | Three-tier country system | Byzantine limits for ~150 middle-tier countries | Critiqued as overreach |
Hardware-Enabled Governance Mechanisms (HEMs): RAND researchers introduced the concept of HEMs, which could be installed on chips otherwise prohibited from export. HEMs could provide “some level of confidence that they could not be misused” through technical enforcement (RAND, 2024). However, HEMs face significant threat vectors and would require robust protection measures.
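One frequently discussed HEM variant is offline licensing, where a chip only operates while it holds a fresh, signed license. The toy sketch below illustrates the control flow only; the key handling, names, and one-week expiry are all hypothetical, not RAND’s actual design, and a real scheme would verify asymmetric signatures inside tamper-resistant hardware rather than use a shared secret:

```python
# Toy sketch of an "offline licensing" HEM: the accelerator refuses to
# run unless it holds an unexpired license signed for its serial number.
import hashlib
import hmac
import time

REGULATOR_KEY = b"shared-secret-provisioned-at-manufacture"  # hypothetical

def sign_license(serial: str, expires: int) -> str:
    """Regulator-side: sign (serial, expiry) into a license token."""
    msg = f"{serial}:{expires}".encode()
    return hmac.new(REGULATOR_KEY, msg, hashlib.sha256).hexdigest()

def chip_allows_compute(serial: str, expires: int, signature: str) -> bool:
    """Chip-side: recompute the expected token and check freshness."""
    expected = sign_license(serial, expires)
    unexpired = time.time() < expires
    return hmac.compare_digest(expected, signature) and unexpired

lic_expiry = int(time.time()) + 7 * 86_400      # one-week license
lic_sig = sign_license("GPU-0001", lic_expiry)  # issued by regulator
print(chip_allows_compute("GPU-0001", lic_expiry, lic_sig))  # True
print(chip_allows_compute("GPU-0002", lic_expiry, lic_sig))  # False: wrong chip
```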
Cloud Computing Alternative: Providing compute as a service offers superior governance opportunities compared to chip export controls. Unlike AI chips accumulated over time, cloud access provides point-in-time computing power that can be restricted or shut off as needed—making it a more precise tool for oversight (Brookings, 2024).
Energy Consumption: The Infrastructure Challenge
AI’s compute demands translate directly into energy consumption at unprecedented scale:
| Metric | 2024 | 2030 (Projected) | Growth Rate |
|---|---|---|---|
| Global data center electricity | 415 TWh (1.5% of global) | 945 TWh (3% of global) | 15%/year |
| US data center electricity | 183 TWh (4% of US total) | ~320 TWh (8.6% of US total by 2035) | Doubling by 2035 |
| AI share of DC power | 5-15% | 35-50% | AI-specific servers: 30%/year |
| AI-specific servers | 53-76 TWh | 165-326 TWh | 4.3x growth |
Energy sources (US, 2024): Natural gas supplied over 40% of electricity for data centers, renewables (wind/solar) about 24%, nuclear around 20%, and coal around 15% (Pew Research, 2025).
Environmental footprint: Company-wide metrics from environmental disclosures suggest that AI systems may have a carbon footprint equivalent to that of New York City in 2025. In 2023, US data centers directly consumed about 17 billion gallons of water, with hyperscale facilities expected to consume 16-33 billion gallons annually by 2028 (ScienceDirect, 2025).
Infrastructure investment: The data center real estate build-out has reached record levels based on select major hyperscalers’ capital expenditure, trending at roughly $200 billion as of 2024 and estimated to exceed $220 billion in 2025 (Deloitte, 2025).
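The headline growth rates above can be cross-checked in a couple of lines; only the 415 and 945 TWh endpoints come from the IEA projection cited in the table (a sketch):

```python
# Implied compound annual growth rate from the IEA's projection of
# global data-center electricity use: 415 TWh (2024) -> 945 TWh (2030).
twh_2024, twh_2030 = 415, 945
years = 2030 - 2024

cagr = (twh_2030 / twh_2024) ** (1 / years) - 1
print(f"implied growth: {cagr:.1%}/year")  # ~14.7%/year, i.e. the ~15% above
```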
Causal Factors
The following factors influence AI compute availability, cost, and governance effectiveness. This analysis is designed to inform future cause-effect diagram creation for the AI Transition Model.
Primary Factors (Strong Influence)
| Factor | Direction | Type | Evidence | Confidence |
|---|---|---|---|---|
| Scaling Laws | ↑ Compute Demand | cause | 2.4x/year cost growth; predictable capability returns | High |
| Supply Chain Concentration | ↑ Governance Tractability | intermediate | TSMC 92%, ASML 100% create chokepoints | High |
| Training Cost Escalation | ↓ Actor Diversity | intermediate | $100M+ costs limit frontier training to well-funded orgs; billion-dollar runs by 2027 | High |
| Hardware Bottlenecks | ↓ Capability Growth Rate | leaf | Memory (HBM) shortages through 2027 | High |
| Energy Infrastructure | ↓ Deployment Speed | leaf | 15%/year DC growth exceeds grid capacity in some regions | High |
Secondary Factors (Medium Influence)
| Factor | Direction | Type | Evidence | Confidence |
|---|---|---|---|---|
| Compute Thresholds | ↑ Regulatory Oversight | intermediate | US/EU/CA frameworks at 10²⁵-10²⁶ FLOPs | Medium |
| Export Controls | Mixed Effect | leaf | May accelerate efficient algorithms (DeepSeek example) | Medium |
| Algorithmic Efficiency | ↓ Compute Demand | cause | “Densing law”: 2x capability density every 3.5 months | Medium |
| Inference Scaling | ↑ Deployment Compute | cause | Test-time compute increasingly important (o1, r1 models) | Medium |
| Geopolitical Tensions | ↑ Supply Risk | leaf | Taiwan Strait conflict would disrupt 92% of advanced chips | Medium |
Minor Factors (Weak Influence)
| Factor | Direction | Type | Evidence | Confidence |
|---|---|---|---|---|
| Cloud Governance | ↑ Oversight Precision | intermediate | Point-in-time control vs. chip accumulation | Low |
| Hardware-Enabled Mechanisms | ↑ Export Flexibility | intermediate | RAND proposal; unproven threat model | Low |
| ASML Competition | ↓ Monopoly Risk | leaf | xLight startup; early stage ($150M funding) | Low |
| TSMC Diversification | ↓ Geographic Risk | leaf | Arizona/Japan/Germany fabs; long timeline | Low |
Scenario Variants
Compute dynamics could evolve along several distinct pathways with different implications for AI safety:
| Scenario | Mechanism | Timeline | Warning Signs | Governance Implications |
|---|---|---|---|---|
| Compute Overhang | Algorithmic breakthroughs make existing compute far more capable | 2-5 years | Efficiency gains exceed hardware scaling; DeepSeek-style innovations | Thresholds become unreliable proxies |
| Hardware Plateau | Physical limits (energy, memory, lithography) constrain scaling | 5-10 years | Slowing Moore’s Law; energy grid bottlenecks | Increased focus on algorithmic safety |
| Geopolitical Disruption | Taiwan conflict disrupts TSMC; China controls advanced chips | 1-10 years | Escalating Taiwan Strait tensions | Western AI capabilities severely degraded |
| Decentralized Compute | Distributed training across many small GPUs becomes viable | 3-8 years | Successful federated learning for frontier models | Governance via chokepoints becomes infeasible |
| Energy Bottleneck | Grid capacity limits data center expansion before compute saturation | 3-7 years | Brownouts near mega-clusters; nuclear partnerships stall | Natural brake on capability growth |
Open Questions
| Question | Why It Matters | Current State |
|---|---|---|
| How robust are compute thresholds to algorithmic progress? | DeepSeek achieved competitive results with less than 10²⁵ FLOPs | Thresholds are static; efficiency gains accelerating |
| What is the energy ceiling for AI? | May be binding constraint before chip supply | Projections vary widely; grid capacity unclear |
| Can TSMC diversification succeed in time? | Arizona fabs won’t reach volume until late 2020s | Geopolitical risk timeline uncertain |
| Will inference scaling change governance calculus? | Shifts compute from training (one-time) to deployment (ongoing) | Reasoning models (o1, r1) show importance; December 2024 HBM controls respond |
| How effective are export controls? | May accelerate rather than impede Chinese AI progress | DeepSeek case study suggests efficiency paradox |
| What are the limits of hardware-enabled governance? | Could enable export of otherwise-restricted chips | Threat model unproven; RAND early-stage research |
| Will memory (HBM) remain the bottleneck? | Determines whether GPU shortages return | SK Hynix: shortages through late 2027 |
| Can cloud-based governance scale globally? | More precise than chip controls but requires infrastructure | Loopholes exist; university research concerns |
Sources
Academic Papers & Preprints
- Epoch AI (2024). “The rising costs of training frontier AI models” - Comprehensive cost analysis; 2.4x/year growth rate
- arXiv (2024). “Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference” - Test-time compute optimization
- arXiv (2024). “Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs” - Conditional scaling laws
- Nature Machine Intelligence (2025). “Densing law of LLMs” - Capability density doubling every 3.5 months
- arXiv (2024). “Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models” - MoE efficiency metrics
Policy & Governance
- Institute for Law & AI (2024). “The Role of Compute Thresholds for AI Governance” - Comprehensive threshold analysis
- RAND (2024). “Hardware-Enabled Governance Mechanisms” - Technical solutions for export control
- GovAI (2024). “Training Compute Thresholds: Features and Functions in AI Regulation” - Regulatory framework analysis
- Brookings (2024). “The new AI diffusion export control rule will undermine US AI leadership” - Critique of three-tier system
- Brookings (2025). “DeepSeek shows the limits of US export controls on AI chips” - Efficiency paradox analysis
- RAND (2025). “Understanding the Artificial Intelligence Diffusion Framework” - Supply chain controls
Industry Analysis & Data
- Epoch AI (2024). “How much does it cost to train frontier AI models?” - GPT-4: $78M, Gemini Ultra: $191M
- Epoch AI (2024). “Most of OpenAI’s 2024 compute went to experiments” - $3B training, $1.8B inference
- Tom’s Hardware (2024). “NVIDIA’s H100 AI GPU shortages ease” - Lead time evolution
- AI News (2025). “AI Chip Shortage 2025: What CTOs Learned the Hard Way” - HBM bottleneck through 2027
- CFR (2025). “China’s AI Chip Deficit: Why Huawei Can’t Catch Nvidia” - Huawei 5% of NVIDIA capacity in 2025
Semiconductor Supply Chain
- AILAB Blog (2025). “The $10 Trillion Chokepoint: How One Company Powers the AI Revolution” - TSMC 92% market share
- Tandfonline (2025). “China’s semiconductor conundrum: understanding US export controls” - Taiwan invasion scenarios
- ScienceDirect (2025). “From vulnerabilities to resilience: Taiwan’s semiconductor industry” - $10T disruption cost
- NPR (2025). “As political winds shift, top chipmaker TSMC looks beyond Taiwan” - Arizona/Japan/Germany expansion
- CNBC (2022). “Inside ASML, the company advanced chipmakers use for EUV lithography” - 100% EUV monopoly
- Strange VC (2025). “ASML’s 30-Year Monopoly: The Moonshot Bet No One Can Replicate” - $9B R&D, 30-year race
- 24/7 Wall St. (2025). “Monopoly No More? ASML May Suddenly Have a New Competitor” - xLight startup, $150M CHIPS Act funding
Energy & Infrastructure
- IEA (2024). “Energy demand from AI” - 415 TWh (2024) → 945 TWh (2030)
- Pew Research (2025). “What we know about energy use at U.S. data centers amid the AI boom” - 183 TWh US (4% of total)
- BloombergNEF (2025). “Power for AI: Easier Said Than Built” - 16 GW → 49 GW by 2035
- Deloitte (2025). “As generative AI asks for more power, data centers seek cleaner energy” - $220B capex in 2025
- ScienceDirect (2025). “The carbon and water footprints of data centers and AI” - 17B gallons water (2023), 16-33B (2028)
AI Transition Model Context
Section titled “AI Transition Model Context”Connections to Other Model Elements
| Model Element | Relationship | Key Insights |
|---|---|---|
| AI Capabilities (Algorithms) | Complementary | Algorithmic efficiency (densing law) reduces compute requirements; may undermine threshold-based governance |
| AI Capabilities (Adoption) | Enabling | Training costs ($100M+) create barriers to entry; energy infrastructure limits deployment speed |
| AI Ownership (Companies) | Concentrating | Only well-funded organizations can afford frontier models; drives consolidation |
| AI Ownership (Countries) | Geopolitical lever | Export controls and supply chain chokepoints (TSMC, ASML) create interstate competition |
| Misalignment Potential (AI Governance) | Primary intervention point | Compute thresholds enable reporting, evaluations, audits before deployment |
| Misuse Potential | Limiting factor | High training costs reduce rogue actor capabilities (though inference compute different) |
| Transition Turbulence (Racing) | Accelerator | Shortages and strategic competition increase pressure for rapid deployment |
| Civilizational Competence (Governance) | Test case | Effectiveness of compute governance indicates broader governance capacity |
Strategic Implications
The research reveals several strategic considerations for the AI transition:
- Governance window closing: If algorithmic efficiency continues to double capability density every 3.5 months (densing law), compute thresholds will become less reliable proxies for risk within 2-3 years; the sketch after this list makes the arithmetic concrete. This suggests urgency in establishing complementary governance mechanisms.
- Energy as natural brake: Infrastructure constraints (15%/year data center growth vs. slower grid expansion) may limit capability growth independent of policy choices. This could provide additional time for governance development but may also increase racing incentives.
- Geopolitical fragility: 92% concentration in Taiwan creates catastrophic downside risk. TSMC diversification timeline (late 2020s for volume production) may not align with AI capabilities timeline (potentially transformative AI by 2027-2030).
- Efficiency paradox: Export controls may accelerate development of compute-efficient algorithms, potentially making frontier capabilities accessible to a wider range of actors. This suggests limits to supply-side interventions.
- Inference shift: Growing importance of deployment/inference compute (o1, r1 reasoning models) changes governance focus from one-time training runs to ongoing operational compute. Cloud-based governance may be more effective for this regime.
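A rough illustration of the first point above: if densing-law gains are treated as a proxy for effective-compute gains (a strong simplifying assumption, since the law is defined over parameters rather than training FLOPs), a fixed threshold erodes quickly as a capability proxy:

```python
# How far a fixed compute threshold erodes as a capability proxy if
# effective capability per FLOP doubles every 3.5 months (assumption:
# densing-law density gains translate one-for-one into compute savings).
DOUBLING_MONTHS = 3.5
US_THRESHOLD = 1e26  # FLOPs

for months in (12, 24, 36):
    gain = 2 ** (months / DOUBLING_MONTHS)
    equivalent_flops = US_THRESHOLD / gain
    print(f"after {months} months: 1e26-class capability reachable at "
          f"~{equivalent_flops:.1e} FLOPs ({gain:,.0f}x efficiency gain)")
# 24 months: ~8.6e23 FLOPs -- an order of magnitude below even the
# EU's 1e25 trigger, consistent with the 2-3 year erosion estimate.
```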
The compute landscape is evolving rapidly, with multiple uncertainty dimensions (algorithmic efficiency, hardware bottlenecks, energy constraints, geopolitical shocks) that could significantly alter the AI transition trajectory. Adaptive governance mechanisms that can respond to these shifts will be critical.