Search, simulation, and science compute

English source file: README.md

Device / setup

CPU/GPU/TPU actor-learner systems, search servers, simulators, and scientific pipelines that combine learned models with explicit exploration or optimization.

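Environment simulation, tree search, self-play data generation, long-horizon credit assignment, and scientific structure search drive overall cost far more than any single forward pass does.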
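To make the cost claim concrete, the sketch below counts network evaluations for one self-play game with and without tree search. It is a back-of-the-envelope illustration only: the function name `evals_per_game` and both constants are assumptions chosen for the example, not figures taken from the papers listed in this file, and it assumes one network evaluation per search simulation (batching and caching are ignored).

```python
def evals_per_game(moves_per_game: int, simulations_per_move: int) -> int:
    """Network evaluations needed to play one self-play game.

    Simplifying assumption: every search simulation costs exactly one
    network evaluation.
    """
    return moves_per_game * simulations_per_move


# Illustrative values only (assumptions, not measurements).
MOVES_PER_GAME = 200
SIMULATIONS_PER_MOVE = 800

no_search = evals_per_game(MOVES_PER_GAME, 1)  # one forward pass per move
with_search = evals_per_game(MOVES_PER_GAME, SIMULATIONS_PER_MOVE)

# Search multiplies the per-game inference cost by the simulation count.
print(f"without search: {no_search} evaluations")
print(f"with search:    {with_search} evaluations")
print(f"ratio:          {with_search // no_search}x")
```

Under these assumed numbers the per-move forward pass is a small fraction of the total; the simulation budget sets the bill.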

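DQN replay, AlphaGo/AlphaZero/MuZero search, AlphaStar/OpenAI Five league self-play, and AlphaFold-style learned potentials or structure prediction adapted learning to simulation and scientific optimization.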
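As a minimal sketch of the replay idea mentioned above, the buffer below stores transitions in a fixed-size ring and samples them uniformly at random, which decouples training batches from the order in which the simulator produced them. The class name `ReplayBuffer`, the `Transition` layout, and the integer states are hypothetical choices for illustration; this is not the implementation from the DQN paper.

```python
import random
from typing import List, Tuple

# (state, action, reward, next_state, done); integer states are a
# placeholder assumption to keep the example self-contained.
Transition = Tuple[int, int, float, int, bool]


class ReplayBuffer:
    """Fixed-capacity ring buffer with uniform random sampling."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.storage: List[Transition] = []
        self.next_index = 0  # slot that the next write will use

    def add(self, transition: Transition) -> None:
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            # Buffer is full: overwrite the oldest transition.
            self.storage[self.next_index] = transition
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement within one batch.
        return random.sample(self.storage, batch_size)

    def __len__(self) -> int:
        return len(self.storage)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add((step, 0, 1.0, step + 1, False))

batch = buffer.sample(2)
print(len(buffer), sorted(t[0] for t in buffer.storage))
```

After five writes into a buffer of capacity three, only the three most recent transitions remain, so old experience ages out as the policy changes.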

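Pure supervised imitation, hand-crafted game heuristics, and classical sampling pipelines became secondary once learned models could guide search or simulation at scale.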
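One way a learned model can guide search in place of hand-crafted heuristics is a PUCT-style selection rule, in which a policy prior scales the exploration bonus for each action. The sketch below is a simplified illustration: the function name `puct_select`, the constant `c_puct = 1.5`, and the example numbers are assumptions, and real systems add details (noise at the root, value normalization, tree bookkeeping) that are omitted here.

```python
import math
from typing import Sequence


def puct_select(
    priors: Sequence[float],
    visit_counts: Sequence[int],
    mean_values: Sequence[float],
    c_puct: float = 1.5,  # exploration constant (assumed value)
) -> int:
    """Return the index of the action maximizing Q + U.

    U = c_puct * prior * sqrt(total_visits) / (1 + visits), so actions
    with a high learned prior and few visits get a larger bonus.
    """
    total_visits = sum(visit_counts)
    best_action = 0
    best_score = -math.inf
    for action, (prior, visits, value) in enumerate(
        zip(priors, visit_counts, mean_values)
    ):
        bonus = c_puct * prior * math.sqrt(total_visits) / (1 + visits)
        score = value + bonus
        if score > best_score:
            best_action, best_score = action, score
    return best_action


# Hypothetical statistics for three child actions of one search node.
chosen = puct_select(
    priors=[0.5, 0.3, 0.2],
    visit_counts=[10, 1, 0],
    mean_values=[0.1, 0.2, 0.0],
)
print(chosen)
```

In this example the unvisited action wins despite having the lowest prior, because the bonus shrinks as an action accumulates visits.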

Rank	Year	Paper	Priority	Status
111	2015	Human-level control through deep reinforcement learning	5	downloaded / read_complete
112	2016	Mastering the game of Go with deep neural networks and tree search	5	downloaded / read_complete
113	2017	Mastering the game of Go without human knowledge	5	downloaded / read_complete
114	2021	Highly accurate protein structure prediction with AlphaFold	5	downloaded / read_complete
115	2018	A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play	4	downloaded / read_complete
116	2019	Mastering Atari, Go, chess and shogi by planning with a learned model	4	downloaded / read_complete
117	2024	Accurate structure prediction of biomolecular interactions with AlphaFold 3	4	downloaded / read_complete
118	2019	Grandmaster level in StarCraft II using multi-agent reinforcement learning	3	html_saved_no_open_pdf / read_complete
119	2019	Dota 2 with Large Scale Deep Reinforcement Learning	3	downloaded / read_complete
120	2020	Improved protein structure prediction using potentials from deep learning	3	downloaded / read_complete
127	2025	Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2	4	downloaded / read_complete
128	2025	AlphaEvolve: A coding agent for scientific and algorithmic discovery	5	downloaded / read_complete