Retrieval-augmented generation

Compute interpretation

An inference-time external-memory pattern: the model conditions on passages retrieved at generation time, paying retrieval latency and indexing work in exchange for grounding and freshness.
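The trade-off can be made concrete with a minimal sketch. It is illustrative only and not taken from any of the reading cards: it assumes a toy bag-of-words index with cosine similarity in place of a learned dense retriever, and it stops at prompt construction rather than calling a model. All function names (`build_index`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Indexing work, paid up front: one term-count vector per document."""
    return [(doc, Counter(tokenize(doc))) for doc in documents]


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(index, query, k=2):
    """Retrieval latency, paid per query: score every document, keep the top k."""
    q = Counter(tokenize(query))
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, passages):
    """Grounding: the generator sees retrieved text, not only its parameters."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    docs = [
        "The index was rebuilt on Tuesday after the crawl finished.",
        "Retrieval adds latency before the first generated token.",
        "Bananas are rich in potassium.",
    ]
    index = build_index(docs)
    query = "When was the index rebuilt?"
    print(build_prompt(query, retrieve(index, query, k=1)))
```

Freshness in this sketch comes from rebuilding the index when documents change; the generator itself is left untouched.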

Supporting reading cards

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020, inference_time_compute_post_training)
REALM: Retrieval-Augmented Language Model Pre-Training (2020, inference_time_compute_post_training)
WebGPT: Browser-assisted question-answering with human feedback (2021, inference_time_compute_post_training)

Obsolete or less central under later compute

Track this only through linked reading cards; do not treat this method page as standalone evidence.