Compute-Structure AI History

AI history through hardware constraints

“How did available hardware, memory, interconnects, data pipelines, and inference setups make some AI methods practical, widely adopted, or obsolete?”

This project reads AI history through the hardware that made it possible. It organizes 132 research papers from 1986 to 2026 into 10 compute regimes, each defined by the device, memory, and interconnect constraints that shaped which methods could be trained, scaled, or deployed.

132 reading cards

10 compute regimes

1986–2026 timespan


Compute Narrative

Ten compute regimes

Each regime marks a period when hardware, memory, and interconnect constraints favored a different set of AI methods.

Pre-2012 CPU and statistical foundations

CPU-centric training, small datasets, and hand-engineered features establish the measurement and optimization culture that later regimes build on.

8 papers

Single-GPU deep learning

Commodity GPUs make high-throughput dense tensor training practical. CNNs, dropout, and batch normalization become dominant.

12 papers

Multi-GPU dense training

The bottleneck shifts to synchronization, batch size, depth, and memory stability across multiple GPUs.

12 papers

TPU and accelerator Transformer era

Accelerators reward large matrix multiplies and sequence batching. Transformers, BERT, and T5 fit this structure.

12 papers

Hyperscale dense LLM training

Training becomes a datacenter-scale problem: model/data parallelism, optimizer state sharding, compute-optimal scaling.

19 papers

Sparse and memory-efficient scaling

Memory, activation cost, and communication pressure drive MoE, attention kernels, sharding, and recomputation.

14 papers

Generative media compute

Image and video generation depend on GPU throughput, denoising iteration cost, and latent-space compression.

13 papers

Inference-time compute and post-training

The frontier shifts to inference allocation: RLHF, chain-of-thought, verifiers, retrieval, tools, and agents.

20 papers

Efficient and edge inference

Deployment constraints dominate: latency, memory footprint, quantization, adapter size, KV-cache pressure.

10 papers

Search, simulation, and science compute

Search, simulation, self-play, and scientific structure prediction combine neural networks with structured inference.

12 papers

