
Compute & Hardware

Summary: Tracks GPU production, training compute growth (4-5x annually), and algorithmic efficiency (doubling every 8 months). Documents that NVIDIA holds 80-90% market share, that 30+ models now train at GPT-4 scale (10²⁵ FLOP), and that AI power consumption reached ~40 TWh in 2024, with total data center demand projected to reach 945 TWh by 2030.
| Dimension | Assessment | Evidence |
|---|---|---|
| Data Completeness | High for public metrics | Epoch AI, IEA reports, company filings |
| Training Compute Growth | 4-5x per year since 2010 | 30+ models at GPT-4 scale (10²⁵ FLOP) as of mid-2025 |
| Algorithmic Efficiency | Doubles every 8 months (95% CI: 5-14) | Epoch AI research on language models |
| Market Concentration | NVIDIA holds 80-90% share | Data center GPU revenue, CUDA ecosystem lock-in |
| Energy Trajectory | 15% annual growth to 2030 | IEA projects 945 TWh by 2030 (3% of global electricity) |
| Key Constraint | Packaging (CoWoS) more than wafers | HBM supply and advanced packaging limit GPU production |
| China Gap | 1-2 node generations behind | SMIC 7nm vs. TSMC 3nm/2nm; Huawei yields at 20-50% |

Compute and hardware metrics are fundamental to understanding AI progress. The availability of specialized AI chips (especially GPUs), total compute used for training, and efficiency improvements determine what models can be built and how quickly capabilities advance. These metrics also inform regulatory thresholds and help forecast future AI development trajectories.

1. GPU Production and Global Stock

| Year | H100/H100-Equivalent | Total Data Center GPUs | Key Notes |
|---|---|---|---|
| 2022 | ~0 (A100 era) | 2.64M | Pre-H100, primarily A100s |
| 2023 | ~0.5M | 3.76M | H100 ramp-up begins |
| 2024 | ~2.0M | ~3.0M H100-equiv | Primarily H100 and early Hopper |
| 2025 (proj.) | 2M Hopper + 5M Blackwell | 6.5-7M | Shift to Blackwell architecture |

Customer Orders (2024): Microsoft purchased 485,000 Hopper AI chips—twice the amount bought by Meta (approximately 240,000), according to Statista market data.

Data Quality: Medium-High. Based on Epoch AI estimates, industry reports, and TSMC capacity analysis.

Sources: Epoch AI GPU production tracking, Tom’s Hardware H100 projections

As of mid-2024, Epoch AI estimates approximately 4 million H100-equivalent GPUs (4e21 FLOP/s) deployed globally. This represents cumulative sales of roughly 3 million H100s between 2022-2024, accounting for depreciation.
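Figures like "4 million H100-equivalents" normalize heterogeneous fleets by peak throughput. A minimal sketch of that conversion (the per-chip FLOP/s values are illustrative assumptions, not Epoch AI's exact coefficients):

```python
# Sketch of FLOP/s-based fleet normalization into H100-equivalents.
PEAK_FLOPS = {
    "A100": 312e12,  # assumed dense FP16 throughput
    "H100": 989e12,  # assumed dense FP16 throughput
}

def h100_equivalents(fleet: dict[str, int]) -> float:
    """Convert a mixed fleet {chip: count} into H100-equivalent units."""
    total = sum(count * PEAK_FLOPS[chip] for chip, count in fleet.items())
    return total / PEAK_FLOPS["H100"]

print(f"{h100_equivalents({'A100': 1_000_000, 'H100': 2_000_000}):,.0f}")
# -> ~2,315,470 H100-equivalents
```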

The stock of computing power from NVIDIA chips has been doubling roughly every 10 months since 2019, a growth rate of about 2.3x per year.
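Those two numbers describe the same trend: a 2.3x annual growth factor implies a doubling time of 12·ln 2 / ln 2.3 ≈ 10 months. A quick check:

```python
import math

def doubling_time_months(annual_growth_factor: float) -> float:
    """Months needed to double at a given annual growth factor."""
    return 12 * math.log(2) / math.log(annual_growth_factor)

print(f"{doubling_time_months(2.3):.1f} months")  # -> 10.0 months
```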

Major Lab Holdings (End of 2024 estimates):

  • OpenAI: ~250k average, ramping to 460k H100-equivalents by year-end (5% of global supply)
  • Anthropic: ~360k H100-equivalents (4% of global supply), including 400k Amazon Trainium2
  • Google: Largest holder with proprietary TPUs plus GPUs (21% of global AI compute)
  • Meta: 13% of global AI compute share

Data Quality: Medium. Based on cost reports, capacity estimates, and informed analysis from industry observers.

Sources: LessWrong GPU estimates, Epoch AI computing capacity


2. Training Compute for Frontier Models

Training compute for frontier AI models has grown 4-5x per year since 2010, accelerating to 5x per year since 2020. According to Epoch AI, this growth rate has been consistent across frontier models, large language models, and models from leading companies.

Notable Training Runs:

| Model | Year | Training Compute | Cost Estimate | Notes |
|---|---|---|---|---|
| GPT-3 | 2020 | ~3×10²³ FLOP | ~$5M | Foundation of modern LLMs |
| GPT-4 | 2023 | ~1×10²⁵ FLOP | $40-100M | First model at 10²⁵ scale |
| GPT-4o | 2024 | ~3.8×10²⁵ FLOP | $100M+ | Largest documented 2024 model |
| Gemini 1.0 Ultra | 2024 | ~2×10²⁵ FLOP | $192M | Most expensive confirmed training |
| Llama 3.1 405B | 2024 | ~1×10²⁵ FLOP | ~$50M+ | Trained on 15T tokens |
| Projected 2027 frontier | 2027 | ~2×10²⁸ FLOP | $1B+ | 1,000x+ GPT-4 scale |

Growth in Large-Scale Models (Epoch AI data insights):

  • 2020: Only 2 models trained with greater than 10²³ FLOP
  • 2023: Over 40 models at this scale
  • Mid-2025: Over 30 models trained at greater than 10²⁵ FLOP (GPT-4 scale)
  • By 2028: Projected 165 models at greater than 10²⁵ FLOP; 81 models at greater than 10²⁶ FLOP

Regulatory Thresholds:

  • EU AI Act: 10²⁵ FLOP reporting requirement
  • US Executive Order 14110: 10²⁶ FLOP reporting requirement

Cost Trajectory: The cost of training frontier AI models has grown by a factor of 2-3x per year for the past eight years, suggesting that the largest models will cost over a billion dollars by 2027 (arXiv analysis).
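A quick sanity check on that claim, compounding an assumed ~$100M GPT-4-class cost at 2-3x per year:

```python
# Compound-growth sketch of frontier training cost; the $100M base and
# the 2-3x growth factors are assumptions taken from the trend above.
base_cost, base_year = 100e6, 2023
for growth in (2.0, 3.0):
    cost, year = base_cost, base_year
    while cost < 1e9:
        cost *= growth
        year += 1
    print(f"at {growth:.0f}x/year, first $1B+ run: {year}")
# -> at 2x/year: 2027; at 3x/year: 2026
```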

Data Quality: High for published models, Medium-Low for unreleased/future models.

Sources: Epoch AI model database, Our World in Data AI training, Epoch AI tracking


3. Cost per FLOP

The cost of compute has declined dramatically, with AI training costs falling roughly 50x faster than Moore's Law in recent years.

Key Metrics:

  • Overall decline (2019-2025): FP32 FLOP cost decreased ~74% (2025 price = 26% of 2019 price)
  • AI training cost decline: ~10x per year (50x faster than Moore’s Law)
  • GPU price-performance: Doubling every 16 months on frontier chips

Historical Training Cost Examples:

  • ResNet-50 image recognition: $1,000 (2017) → $10 (2019)
  • ImageNet 93% accuracy: Halving every 9 months (2012-2022)
  • GPT-4 equivalent model: $100M (2023) → ~$20M (Q3 2023) → ~$3M (efficiency optimized, 01.ai claim)

GPU Generation Improvements:

  • A100 → H100: 2x price-performance in 16 months
  • Expected trend: ~1.4x per year improvement for frontier chips
  • Google TPU v5p (2025): 30% throughput improvement, 25% lower energy vs v4

Data Quality: High for historical data, Medium for projections.

Sources: Epoch AI training costs, ARK Invest AI training analysis, Our World in Data GPU performance


4. Training Efficiency (Algorithmic Progress)


Algorithmic improvements contribute as much to AI progress as increased compute. According to Epoch AI research, the compute needed to achieve a given performance level has halved roughly every 8 months (95% CI: 5-14 months)—faster than Moore’s Law’s 2-year doubling time.
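An 8-month halving time is 2^(12/8) ≈ 2.8x per year of algorithmic gain; stacked on the 4-5x per year hardware scaling from Section 2, effective compute grows by very roughly an order of magnitude per year. A one-liner to see the arithmetic:

```python
algorithmic = 2 ** (12 / 8)  # 8-month halving of required compute = ~2.83x/year
hardware = 4.5               # assumed midpoint of the 4-5x/year compute trend
print(f"effective compute growth: ~{algorithmic * hardware:.1f}x per year")  # ~12.7x
```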

| Study | Annual Efficiency Gain | Methodology |
|---|---|---|
| Ho et al. 2024 | 2.7x (95% CI: 1.8-6.3x) | Language model benchmarks |
| Ho et al. 2025 | 6x per year | Updated methodology |
| OpenAI 2020 | ~4x per year | ImageNet classification |
| Epoch AI 2024 | 3x per year average | Cross-benchmark analysis |

Key Findings:

  • Doubling time: Algorithms double effective compute every 8 months (95% CI: 5-14 months)
  • Annual improvement rate: 2.7-6x per year in FLOP efficiency depending on methodology
  • Contribution to progress: 35% from algorithmic improvements, 65% from scale (since 2014)

Major Sources of Efficiency Gains (arXiv research): Between 2017 and 2025, 91% of algorithmic progress at frontier scale comes from two innovations:

  1. Switch from LSTM to Transformer architecture
  2. Rebalancing to Chinchilla-optimal scaling

Specific Benchmarks:

  • ImageNet classification: 44x less compute for AlexNet-level performance (2012-2024)
  • Language modeling: Algorithms account for 22,000x improvement on paper (2012-2023)
    • Actual measured innovations account for less than 100x
    • Gap explained by scale-dependent efficiency improvements

Inference Cost Reduction Example:

  • GPT-3.5-equivalent model cost: $20 per million tokens (Nov 2022) to $0.07 per million tokens (Oct 2024)
  • Total reduction: 280x+ in roughly two years

Recent Efficiency Breakthroughs:

  • DeepSeek V3: GPT-4o-level performance with fraction of training compute
  • AlphaEvolve: 32.5% speedup for FlashAttention kernel in Transformers

Data Quality: High. Based on rigorous academic research and reproducible benchmarks.

Sources: Epoch AI algorithmic progress, OpenAI efficiency research, ArXiv algorithmic progress paper


5. Data Center Power Consumption

According to the IEA Energy and AI Report, data center electricity consumption has grown at 12% per year over the last five years.

Global Data Centers:

  • Total electricity consumption: 415 TWh (1.5% of global electricity)
  • AI-specific consumption: 40 TWh (roughly a tenth of the data center total, up from 2 TWh in 2017; a rough cross-check follows this list)
  • AI share of data center power: 5-15% currently, projected to reach 35-50% by 2030
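As a rough cross-check, the ~4M H100-equivalents from Section 1 imply a figure in the same ballpark (TDP, utilization, and PUE are assumed values):

```python
# Back-of-envelope: annual energy of the global AI accelerator stock.
gpus = 4e6         # H100-equivalents deployed (Section 1)
tdp_kw = 0.7       # assumed ~700 W per H100-class accelerator
utilization = 0.8  # assumed average load
pue = 1.3          # assumed facility overhead (cooling, power delivery)
twh = gpus * tdp_kw * utilization * pue * 8760 / 1e9
print(f"~{twh:.0f} TWh/year")
# -> ~26 TWh from accelerators alone; host CPUs, networking, and storage
#    plausibly close the gap to the ~40 TWh estimate
```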

Regional Breakdown (2024) per IEA analysis:

| Region | Data Center Consumption | Share of Global Total |
|---|---|---|
| United States | 183 TWh | 45% |
| China | 104 TWh | 25% |
| Europe | 62 TWh | 15% |
| Rest of World | 66 TWh | 15% |

United States (Pew Research):

  • Data center consumption: 183 TWh (over 4% of US total, equivalent to Pakistan’s annual consumption)
  • Growth: 58 TWh (2014) to 183 TWh (2024)

Global (IEA projections):

  • 2030 projection: 945 TWh (nearly 3% of global electricity)
  • Annual growth rate: 15% per year (2024-2030)—4x faster than total electricity growth
  • AI-optimized data centers: more than 4x growth by 2030

Regional Growth to 2030 (IEA Base Case):

| Region | 2024 | 2030 Projection | Increase |
|---|---|---|---|
| United States | 183 TWh | 423 TWh | +130% |
| China | 104 TWh | 279 TWh | +170% |
| Europe | 62 TWh | 107+ TWh | +70% |

Server Type Breakdown:

  • Accelerated servers (AI): 30% annual growth
  • Conventional servers: 9% annual growth

Data Quality: High. Based on IEA, DOE, and industry analyses.

Sources: IEA Energy and AI Report, Pew Research data center energy, DOE data center report


6. Fab Capacity

TSMC has committed 28% of its total wafer capacity to AI chip manufacturing. Advanced 3nm and 5nm nodes contribute approximately 74% of overall wafer revenue, and the AI/HPC segment accounts for 59% of total revenue (Spark analysis).

3nm Capacity Ramp (WCCFtech):

  • Q3 2025: 3nm at 23% of total revenue (surpassing 5nm)
  • Current production: 100,000-110,000 wafers/month
  • End of 2025 target: 160,000 wafers/month
  • NVIDIA adding 35,000 wafers/month in 3nm alone

2nm Node (N2) Roadmap (WCCFtech):

  • Mass production: Q4 2025
  • End of 2025: 45,000-50,000 wafers/month
  • End of 2026: 100,000 wafers/month
  • 2028: 200,000 wafers/month (including Arizona)
  • Major customers: Apple (50% reserved), Qualcomm; NVIDIA starting 2027

US Expansion (Tom’s Hardware):

  • Arizona Fab 1: 4nm production online (late 2024)
  • Arizona Fab 2: 3nm production starting 2027 (ahead of schedule)
  • Total US investment: $165 billion for three fabs, packaging, and R&D
TSMC Capacity Summary:

| Node | 2024 Status | 2025 Projection | 2026 Projection |
|---|---|---|---|
| 3nm | 100-110k wpm | 160k wpm | Fully booked |
| 2nm | Risk production | 45-50k wpm | 100k wpm |
| CoWoS packaging | Doubled 2024 | Doubling again | Critical constraint |
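Wafer capacity converts to GPU counts through die size and yield. A sketch using the standard dies-per-wafer approximation (the ~814 mm² die matches the H100; the 65% yield is an assumption):

```python
import math

def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> float:
    """Classic dies-per-wafer estimate, subtracting edge losses."""
    radius = wafer_diameter_mm / 2
    return (math.pi * radius**2 / die_area_mm2
            - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

gross = gross_dies_per_wafer(814)  # H100-class die size
good = gross * 0.65                # assumed defect yield
print(f"{gross:.0f} gross / ~{good:.0f} good dies per 300mm wafer")
# -> ~64 gross / ~41 good; 10k wafers/month would be ~400k dies/month,
#    before CoWoS packaging and HBM supply cap what actually ships
```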

Samsung Foundry:

Current/Near-term:

  • 3nm SF3 (GAA): Available 2025
  • 2nm SF2: Late 2025 start
  • Monthly capacity target: 21k wpm by end of 2026 (163% increase from 2024)

Long-term:

  • Sub-2nm target: 50-100k wpm by 2028
  • Taylor, Texas fab: 93.6% complete (Q3 2024), full completion July 2026

Market Position:

  • Gaining from TSMC capacity constraints
  • Major wins: Tesla AI chips, AMD/Google considering 2nm production
  • 2024 growth: 11% capacity increase
  • 2025 growth: 10% capacity increase (17% for leading-edge with 2nm ramp)
  • 2026 global capacity: 12.7M wafers per month industry-wide (SEMI)
  • Main constraint: Chip packaging (CoWoS) and HBM, not wafer production

Data Quality: High. Based on company reports, industry analysis, and fab construction tracking.

Sources: SEMI fab capacity report, TrendForce Samsung 2nm


7. GPU Utilization (Training vs. Inference)

Current Understanding (2024):

  • Training vs. Inference split: Currently ~80% training, ~20% inference
  • Projected 2030 split: ~30% training, ~70% inference (reversal)

Lab-Specific Data:

OpenAI (2024):

  • Training compute: $3B amortized cost
  • Inference compute: $1.8B (likely understated for single-year)
  • Research compute: $1B
  • Over a model's lifetime, cumulative inference can cost 15-118x more than the initial training run

Historical Inference Ratios:

  • Google (2019-2021): Inference = 60% of total ML compute (three-week snapshots)
  • Inference costs grow continuously after deployment while training is one-time

Utilization Challenges:

  • Packaging bottlenecks (CoWoS)
  • HBM supply constraints
  • Infrastructure development lag

Data Quality: Medium-Low. Most labs don’t publish utilization rates; estimates based on cost reports.

Sources: Epoch AI inference allocation, A&M training demand analysis


8. Inference/Training Compute Ratio

Current State:

  • Industry split: 80% training, 20% inference (2024)
  • OpenAI token generation: ~100B tokens/day = 36T tokens/year
  • Training tokens for modern LLMs: ~10T tokens
  • Token cost ratio: Training tokens ~3x more expensive than inference
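That ~3x ratio falls out of the standard FLOP approximations: training costs ≈ 6·N·D FLOP for N parameters and D tokens, while inference costs ≈ 2·N FLOP per token, so each training token is ~3x an inference token. Plugging in the figures above:

```python
# Standard approximations: training ~ 6*N*D FLOP, inference ~ 2*N FLOP/token.
N = 1e12                   # assumed parameter count (illustrative)
train_tokens = 10e12       # ~10T training tokens (above)
infer_tokens_year = 36e12  # ~100B tokens/day (above)

train_flop = 6 * N * train_tokens
infer_flop_year = 2 * N * infer_tokens_year
print(f"one year of inference = {infer_flop_year / train_flop:.1f}x the training run")
# -> 1.2x: at these volumes, cumulative inference passes the one-time
#    training cost within the first year of deployment
```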

Evolution:

  • 2019-2021 (Google): 60% inference, 40% training (based on 3-week snapshots)
  • 2024 (Industry): 80% training, 20% inference (during training surge)
  • 2030 (Projected): 70% inference, 30% training (post-surge equilibrium)

Theoretical Optimal Allocation:

  • For roughly equal value per compute in training vs. inference, the tradeoff parameter (α) must be near 1
  • For significantly different allocations (10x difference), α must be below 0.1 or above 10
  • Current industry behavior suggests α close to 1, hence similar magnitudes

Inference Growth Drivers:

  • Deployment at scale requires continuous inference compute
  • One-time training cost vs. ongoing serving costs
  • By 2030, ~70% of data center AI demand projected to be inference

Data Quality: Medium. Based on partial disclosures and theoretical models.

Sources: Epoch AI compute allocation theory, Epoch AI OpenAI compute spend


9. Training Cost Projections

GPT-4 Initial Training (2023):

  • Official estimate: “More than $100M” (Sam Altman)
  • Epoch AI hardware/energy only: $40M
  • Full cost estimates: $78-192M depending on methodology

GPT-4-Equivalent Training Costs (Optimized):

  • Q3 2023: ~$20M (3x cheaper with efficiency improvements)
  • 01.ai claim: ~$3M using 2,000 GPUs and optimization

Training Cost Growth (Frontier Models):

  • Historical trend: Tripling per year (4x compute growth, 1.3x efficiency gain)
  • If trend continues: $1B+ training runs by 2027
  • Dario Amodei (Aug 2024): “$1B models this year, $10B models by 2025”

Cost Decline (Equivalent Performance):

  • Algorithmic efficiency: 2x every 9 months (~2.5x per year)
  • Hardware efficiency: 1.4x per year
  • Combined: ~3.5x per year from these two factors alone; broader estimates (Section 3) put the decline for equivalent capability at ~10x per year

Optimistic Scenario (Efficiency improvements continue):

  • 2023: $20M (optimized)
  • 2024: $2M (10x reduction)
  • 2025: $200k (crosses the $1M threshold)
  • 2026: under $100k

Conservative Scenario (Slower efficiency gains):

  • Assume 3x annual reduction instead of 10x
  • 2023: $20M
  • 2025: $2.2M
  • 2026: ~$740k (crosses the $1M threshold)
  • 2027: $240k
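Both scenarios are plain geometric decay; a minimal sketch that reproduces them and flags the $1M crossing:

```python
def project(start_cost: float, start_year: int, annual_reduction: float, years: int) -> None:
    """Cost of GPT-4-level training under a constant annual reduction factor."""
    cost = start_cost
    for year in range(start_year, start_year + years + 1):
        marker = "  <- below $1M" if cost < 1e6 else ""
        print(f"{year}: ${cost:,.0f}{marker}")
        cost /= annual_reduction

project(20e6, 2023, 10, 3)  # optimistic: 10x/year, crosses $1M in 2025
project(20e6, 2023, 3, 4)   # conservative: 3x/year, crosses $1M in 2026
```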

Important Notes:

  • These projections are for achieving GPT-4-level performance, not frontier capabilities
  • Frontier models will continue to cost $100M-$1B+ as labs push boundaries
  • The trend is divergent: equivalent performance gets cheaper while cutting-edge gets more expensive

Data Quality: Medium. Based on historical trends and partial cost disclosures.

Sources: Juma GPT-4 cost breakdown, Fortune AI training costs, ArXiv training costs


10. Nvidia’s AI Accelerator Market Share


Current Market Position (2024-2025) (Statista, Fortune Business Insights):

  • Dominant share: 80-95% of AI accelerator market
  • Conservative estimates: 70-86%
  • Most commonly cited: 80-90%

Market Size (Grand View Research):

  • 2024: $14.48B data center GPU market
  • 2032 projected: $295B (13.5% CAGR)
  • Alternative estimate (Precedence Research): $192B by 2034

Nvidia Revenue (Statista):

  • FY 2024 data center revenue: $47.5B (216% YoY increase)
  • Q3 2025 data center revenue: $30.8B (112% YoY)
  • Data center share: 87% of total company revenue

Competitive Landscape:

| Company | 2025 Market Share | Key Products | Notes |
|---|---|---|---|
| Nvidia | 80-90% | H100, H200, Blackwell | CUDA lock-in, dominant position |
| AMD | ~8-10% | MI300 series | $5.6B projected (2025), doubling DC footprint |
| Intel | ~8% | Gaudi 3 | 8.7% of training accelerators by end 2025 |
| Google | Internal use | TPU v5p | $3.1B value (2025), custom deployment |

Nvidia’s Competitive Advantages:

  1. CUDA ecosystem: Deep software integration, high switching costs
  2. Performance leadership: H100/H200 industry standard
  3. Supply relationships: Preferential TSMC access
  4. First-mover advantage: Established during AI boom

Emerging Threats:

  • Custom silicon (Google TPU, Amazon Trainium)
  • Meta considering shift from CUDA to TPU (billions in spending)
  • JAX job postings grew 340% vs. CUDA 12% (Jan 2025)
  • Inference workloads bleeding to ASICs

Data Quality: High. Based on market research firms and financial disclosures.

Sources: PatentPC AI chip market stats, TechInsights Q1 2024, CNBC Nvidia market analysis


11. China’s Domestic AI Chip Production Capacity


SMIC (Semiconductor Manufacturing International Corporation) (Tom’s Hardware):

  • Current 7nm capacity: approximately 30k wafers per month (wpm)
  • 2025 target: 45-50k wpm advanced nodes
  • 2026 projection: 60k wpm
  • 2027 projection: 80k wpm (with yields potentially reaching 70%)
  • Plans to double 7nm capacity in 2025 (most advanced process in mass production in China)

Huawei Ascend AI Chips (SemiAnalysis, Bloomberg):

| Metric | 2024 | 2025 (Projected) | 2026 (Projected) |
|---|---|---|---|
| Dies produced | 507k (mostly 910B) | 805k-1.5M | 1.2M+ (Q4 alone) |
| Packaged chips shipped | ~200k | 600-700k | ~600k (910C) |
| Yield rate (910C) | ~20-30% | Improving | 70% target |
| Technology node | SMIC 7nm (DUV) | SMIC N+2 | Continued DUV |

Production Bottlenecks (SemiAnalysis):

  1. HBM (High-Bandwidth Memory) - Critical constraint:

    • Huawei’s stockpile: 11.7M HBM stacks (7M from Samsung pre-restrictions)
    • Stockpile depletion: Expected end of 2025
    • CXMT domestic production: approximately 2M stacks in 2026 (supports only 250-400k chips; see the sketch after this list)
  2. Yield challenges (TrendForce):

    • Ascend 910C yield: approximately 20-30% (on older stockpiled equipment)
    • Ascend 910B yield: approximately 50%
    • Low yields force production cuts and order delays
    • Without EUV, advanced packaging, and unrestricted HBM access, chips remain constrained
  3. TSMC die bank:

    • Huawei received 2.9M+ Ascend dies from TSMC (pre-sanctions)
    • This stockpile enables 2024-2025 production
    • Without die bank, production would be much lower
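HBM arithmetic explains the stated ceiling: at an assumed 5-8 HBM stacks per accelerator, ~2M CXMT stacks per year support only 250-400k chips:

```python
# How many accelerators a given HBM supply can support.
cxmt_stacks = 2e6               # projected 2026 CXMT output (above)
for stacks_per_chip in (8, 5):  # assumed HBM stacks per packaged chip
    print(f"{stacks_per_chip} stacks/chip -> {cxmt_stacks / stacks_per_chip:,.0f} chips")
# -> 250,000 to 400,000 chips, matching the range above
```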

Huawei Fab Buildout:

  • Dedicated AI chip facility: End of 2025
  • Additional sites: 2 more in 2026
  • WFE (wafer fab equipment) spending: $7.3B (2024, up 27% YoY)
  • Global ranking: 4th largest WFE customer (from zero in 2022)

Production Ramp Timeline:

  • Q3 2024: Ascend 910B production ramp begins
  • Q1 2025: Ascend 910C mass production starts (on SMIC N+2 process)
  • 2025-2026: Continued ramp, constrained by HBM

Huawei vs. Nvidia (Tom’s Hardware analysis):

  • Huawei ecosystem scaling up but lags significantly on efficiency and performance
  • Technology node: 7nm (Huawei/SMIC) vs. 4nm/3nm (Nvidia/TSMC)
  • Memory bottleneck: Ascend chips cannot match NVIDIA’s HBM subsystem
  • Export controls successfully limiting China’s access to cutting-edge AI chips
  • Gap expected to persist due to continued US restrictions

Data Quality: Medium. Based on industry analysis, supply chain reports, and informed estimates.

Sources: Tom’s Hardware China AI chip production, SemiAnalysis Huawei production, WCCFtech Huawei capacity


12. Semiconductor Equipment Lead Times

Historical Peak Lead Times (2022), at the height of the chip shortage:

  • ArF immersion equipment: 24 months
  • EUV equipment: 18 months
  • I-line equipment: 18 months
  • Industry average (all equipment): 14 months (up from 3-6 months pre-shortage)

Current State (2024-2025):

  • Lead times have moderated from 2022 peak but remain “incredibly long”
  • Foundries must plan capacity expansions well in advance
  • Exact current lead times not publicly disclosed

ASML Production Capacity Targets:

| Equipment Type | 2025 Target | Medium-term Target |
|---|---|---|
| EUV 0.33 NA | 90 systems/year | Maintained |
| DUV (immersion + dry) | 600 systems/year | Maintained |
| EUV High-NA (0.55 NA) | - | ~20 systems/year |

2024 Shipments (Actual):

  • Total lithography: 418 systems
  • EUV: 44 systems
  • DUV: 374 systems
  • Metrology/inspection: 165 systems

High-NA EUV Systems:

  • Cost: $400M+ per system (vs. $200M for low-NA)
  • First commercial deployment: Intel TWINSCAN EXE:5200B
  • Status: Transition from low-NA to high-NA beginning 2024-2025

ASML Market Dominance:

  • Lithography equipment market share: ~94% (2024)
  • Remaining 6%: Canon and Nikon
  • Monopoly on EUV lithography (only supplier globally)

China Export Restrictions:

  • ASML expects China customer demand to decline significantly in 2026 vs. 2024-2025
  • However, total 2026 net sales not expected to fall below 2025 levels (non-China growth compensates)

China’s EUV Development:

  • Reports of prototype EUV lithography machine development
  • Target: AI chip output by 2028 using domestic EUV
  • Status: Early prototype, far from production capability

Lead Time Implications:

  • Long lead times favor incumbents with existing allocations
  • New entrants (especially geopolitically restricted) face multi-year delays
  • Supply constraints on advanced packaging (CoWoS) now more critical than lithography

Data Quality: Medium-High. Based on ASML reports and industry analysis.

Sources: SMM ASML lead times, TrendForce ASML EUV analysis, Tom’s Hardware ASML capacity


Data Quality Summary

| Metric | Data Quality | Update Frequency | Key Gaps |
|---|---|---|---|
| GPU Production | Medium-High | Quarterly | Exact production numbers proprietary |
| Training Compute | High (public models) | Ongoing | Unreleased model estimates uncertain |
| Cost per FLOP | High | Annual | Future projections uncertain |
| Training Efficiency | High | Annual | Contribution breakdown debated |
| Data Center Power | High | Annual | AI-specific breakdown incomplete |
| Fab Capacity | High | Quarterly | Packaging/HBM constraints harder to track |
| GPU Utilization | Low | Rare | Most labs don't disclose |
| Inference/Training Ratio | Medium | Rare | Industry-wide data sparse |
| Cost Projections | Medium | N/A | Depends on uncertain trends |
| Nvidia Market Share | High | Quarterly | Custom silicon market opaque |
| China Production | Medium | Quarterly | True yields/capacity uncertain |
| Equipment Lead Times | Medium | Annual | Real-time data proprietary |

Key Uncertainties

The actual contribution of algorithmic improvements vs. scale-dependent effects remains debated. Measured innovations account for less than 100x of the claimed 22,000x improvement, with the gap attributed to scaling effects that are harder to isolate.

Whether inference will truly dominate by 2030 depends on:

  • Rate of model deployment at scale
  • Efficiency improvements in inference
  • Whether training runs continue to grow exponentially

Estimates of China’s domestic chip production vary widely (200k to 1.5M dies) due to:

  • Yield rate uncertainty
  • HBM supply constraints
  • Stockpile utilization vs. new production
  • Lack of independent verification

Major labs don’t disclose actual utilization rates, training efficiency, or infrastructure bottlenecks. The 80/20 training/inference split is an industry estimate, not measured data.



Last updated: December 2025