Inference-time reasoning
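English source file: inference_time_reasoning.md
Compute interpretation
Methods that spend additional inference-time compute after pretraining through prompting and sampling strategies, rather than by modifying model weights.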
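As a minimal sketch of what spending compute at inference time through sampling can look like, the snippet below implements majority voting over several sampled answers, in the spirit of the Self-Consistency reading card. It is an illustration under stated assumptions, not the method of any listed paper: `self_consistency`, `make_toy_sampler`, and the toy answer distribution are hypothetical names and values, and the sampler is a stand-in for a real temperature-sampled model call.

```python
import random
from collections import Counter
from typing import Callable, Tuple


def self_consistency(
    sample_answer: Callable[[str], str],
    prompt: str,
    n_samples: int,
) -> Tuple[str, float]:
    """Draw n_samples answers for one prompt and return the majority answer
    together with its vote share. Model weights are never touched; the only
    extra cost is n_samples calls to the sampler."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples


def make_toy_sampler(seed: int = 0) -> Callable[[str], str]:
    """Hypothetical stand-in for a stochastic model call (assumption):
    returns "42" about 60% of the time, otherwise a nearby wrong answer."""
    rng = random.Random(seed)

    def sample(prompt: str) -> str:
        if rng.random() < 0.6:
            return "42"
        return rng.choice(["41", "43"])

    return sample


if __name__ == "__main__":
    sampler = make_toy_sampler(seed=0)
    # Raising n_samples is the "extra inference-time compute" knob.
    for n in (1, 5, 25):
        answer, share = self_consistency(sampler, "What is 6 * 7?", n)
        print(f"n={n:>2}  answer={answer}  vote_share={share:.2f}")
```

The design choice to pass the sampler in as a callable keeps the voting logic independent of any particular model API, so the same function works with a real client wrapped in a one-argument function.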
Supporting reading cards
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022, inference_time_compute_post_training)
- ReAct: Synergizing Reasoning and Acting in Language Models (2022, inference_time_compute_post_training)
- Self-Consistency Improves Chain of Thought Reasoning in Language Models (2022, inference_time_compute_post_training)
- Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks (2022, inference_time_compute_post_training)
- Let's Verify Step by Step (2023, inference_time_compute_post_training)
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models (2023, inference_time_compute_post_training)
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (2025, inference_time_compute_post_training)
- Kimi k1.5: Scaling Reinforcement Learning with LLMs (2025, inference_time_compute_post_training)
- s1: Simple test-time scaling (2025, inference_time_compute_post_training)
- Qwen3 Technical Report (2025, inference_time_compute_post_training)
- Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 (2025, hyperscale_dense_llm_training)
- AlphaEvolve: A coding agent for scientific and algorithmic discovery (2025, search_simulation_science_compute)
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models (2025, search_simulation_science_compute)
- Kimi K2.5: Visual Agentic Intelligence (2026, sparse_memory_efficient_scaling)
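Content that is outdated or secondary under later compute paradigms
Track this only through the linked reading cards; do not treat this method page as an independent source of evidence.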