Compute-Structure AI History

AI history through hardware constraints

“How did available hardware, memory, interconnects, data pipelines, and inference setups make some AI methods practical, widely adopted, or obsolete?”

This project reads AI history through the hardware that made it possible. It organizes 132 research papers, spanning 1986 to 2026, into 10 compute regimes, each defined by the device, memory, and interconnect constraints that shaped which methods could be trained, scaled, or deployed.

Reading cards: 132
Compute regimes: 10
Timespan: 1986–2026
Method notes: 25

Compute Narrative

Ten compute regimes

Each regime marks a period when hardware, memory, and interconnect constraints favored a different set of AI methods.

01

Pre-2012 CPU and statistical foundations

CPU-centric training, small datasets, and hand-engineered features establish the measurement and optimization culture that later regimes depend on.

8 papers
02

Single-GPU deep learning

Commodity GPUs make high-throughput dense tensor training practical. CNNs, dropout, and batch normalization become dominant.

12 papers
03

Multi-GPU dense training

The bottleneck shifts to synchronization, batch size, depth, and memory stability across multiple GPUs.

12 papers
04

TPU and accelerator Transformer era

Accelerators reward large matrix multiplies and sequence batching. Transformers, BERT, and T5 fit this structure.

12 papers
05

Hyperscale dense LLM training

Training becomes a datacenter-scale problem: model/data parallelism, optimizer state sharding, compute-optimal scaling.

19 papers
06

Sparse and memory-efficient scaling

Memory, activation cost, and communication pressure drive MoE, attention kernels, sharding, and recomputation.

14 papers
07

Generative media compute

Image and video generation depend on GPU throughput, denoising iteration cost, and latent-space compression.

13 papers
08

Inference-time compute and post-training

The frontier shifts to how compute is allocated at inference time and in post-training: RLHF, chain-of-thought, verifiers, retrieval, tools, and agents.

20 papers
09

Efficient and edge inference

Deployment constraints dominate: latency, memory footprint, quantization, adapter size, KV-cache pressure.

10 papers
10

Search, simulation, and science compute

Search, simulation, self-play, and scientific structure prediction combine neural networks with structured inference.

12 papers