SGD and stochastic optimizers

Compute interpretation

An optimization style that trades exact full-batch gradients for cheap, noisy minibatch updates. It becomes central once datasets and models outgrow full-batch training.
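
To make the trade concrete, here is a minimal sketch of minibatch SGD on a one-parameter least-squares problem. Everything in it is an illustrative assumption rather than something taken from the linked cards: the synthetic data, the function names (make_data, full_gradient, minibatch_gradient, sgd), and the learning rate, batch size, and step count. The exact gradient touches every example; the stochastic estimate touches only batch_size of them per step.

```python
import random


def make_data(n=1000, true_w=3.0, noise=0.1, seed=1):
    # Hypothetical synthetic dataset: y = true_w * x + Gaussian noise.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, true_w * x + rng.gauss(0.0, noise)))
    return data


def full_gradient(w, data):
    # Exact gradient of the mean of 0.5 * (w*x - y)^2 over the whole dataset.
    # Cost per call grows linearly with len(data).
    return sum((w * x - y) * x for x, y in data) / len(data)


def minibatch_gradient(w, data, batch_size, rng):
    # Noisy but unbiased estimate of the same gradient from a random subset.
    # Cost per call depends on batch_size, not on len(data).
    batch = rng.sample(data, batch_size)
    return sum((w * x - y) * x for x, y in batch) / batch_size


def sgd(data, lr=0.1, batch_size=8, steps=500, seed=0):
    # Plain SGD: repeatedly step against the minibatch gradient estimate.
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        w -= lr * minibatch_gradient(w, data, batch_size, rng)
    return w


data = make_data()
w_sgd = sgd(data)
print("SGD estimate of w:", w_sgd)
print("Full gradient at that estimate:", full_gradient(w_sgd, data))
```

Each update here reads 8 examples instead of 1000, so the per-step cost stays fixed as the dataset grows; the price is that the iterate jitters around the minimizer instead of settling exactly on it.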

Supporting reading cards

Obsolete or less central under later compute

Track this only through linked reading cards; do not treat this method page as standalone evidence.