| 87 | 2020 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | 5 | downloaded / read_complete |
| 88 | 2022 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | 5 | downloaded / read_complete |
| 89 | 2022 | Training language models to follow instructions with human feedback | 5 | downloaded / read_complete |
| 90 | 2022 | ReAct: Synergizing Reasoning and Acting in Language Models | 5 | downloaded / read_complete |
| 91 | 2017 | Deep Reinforcement Learning from Human Preferences | 4 | downloaded / read_complete |
| 92 | 2020 | Learning to summarize from human feedback | 4 | downloaded / read_complete |
| 93 | 2022 | Self-Consistency Improves Chain of Thought Reasoning in Language Models | 4 | downloaded / read_complete |
| 94 | 2023 | Toolformer: Language Models Can Teach Themselves to Use Tools | 4 | downloaded / read_complete |
| 95 | 2023 | Let's Verify Step by Step | 4 | downloaded / read_complete |
| 96 | 2020 | REALM: Retrieval-Augmented Language Model Pre-Training | 3 | downloaded / read_complete |
| 97 | 2021 | WebGPT: Browser-assisted question-answering with human feedback | 3 | downloaded / read_complete |
| 98 | 2022 | Constitutional AI: Harmlessness from AI Feedback | 3 | downloaded / read_complete |
| 99 | 2022 | Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks | 3 | downloaded / read_complete |
| 100 | 2023 | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | 3 | downloaded / read_complete |
| 101 | 2023 | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | 3 | downloaded / read_complete |
| 102 | 2023 | Voyager: An Open-Ended Embodied Agent with Large Language Models | 3 | downloaded / read_complete |
| 121 | 2025 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | 5 | downloaded / read_complete |
| 122 | 2025 | Kimi k1.5: Scaling Reinforcement Learning with LLMs | 5 | downloaded / read_complete |
| 123 | 2025 | s1: Simple test-time scaling | 4 | downloaded / read_complete |
| 131 | 2026 | Kimi K2.5: Visual Agentic Intelligence | 4 | downloaded / read_complete |