Compute Regimes

Ten stages of AI compute

Each regime is defined by a specific hardware environment: the device, memory, interconnect, and data-pipeline constraints that shaped which AI methods could be trained, scaled, or deployed.

10 regimes

132 papers total

1986–2026 years covered

01 8 papers

Pre-2012 CPU and statistical foundations

CPU-centric training on small datasets with hand-engineered features. This period established the measurement and optimization culture that later regimes depend on.

02 12 papers

Single-GPU deep learning

Commodity GPUs make high-throughput dense tensor training practical. CNNs, dropout, and batch normalization become dominant.

03 12 papers

Multi-GPU dense training

The bottleneck shifts to synchronization, batch size, depth, and memory stability across multiple GPUs.

04 12 papers

TPU and accelerator Transformer era

Accelerators reward large matrix multiplies and sequence batching. Transformers, including BERT and T5, fit this structure.

05 19 papers

Hyperscale dense LLM training

Training becomes a datacenter-scale problem: model and data parallelism, optimizer state sharding, and compute-optimal scaling.

06 14 papers

Sparse and memory-efficient scaling

Memory, activation cost, and communication pressure drive mixture-of-experts (MoE) models, attention kernels, sharding, and recomputation.

07 13 papers

Generative media compute

Image and video generation depend on GPU throughput, denoising iteration cost, and latent-space compression.

08 20 papers

Inference-time compute and post-training

The frontier shifts to post-training and inference-time compute allocation: RLHF, chain-of-thought, verifiers, retrieval, tools, and agents.

09 10 papers

Efficient and edge inference

Deployment constraints dominate: latency, memory footprint, quantization, adapter size, and KV-cache pressure.

10 12 papers

Search, simulation, and science compute

Search, simulation, self-play, and scientific structure prediction combine neural networks with structured inference.
