Towards General-Purpose Model-Free Reinforcement Learning Paper • 2501.16142 • Published 2 days ago • 18
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding Paper • 2501.13200 • Published 7 days ago • 60
Control LLM: Controlled Evolution for Intelligence Retention in LLM Paper • 2501.10979 • Published 10 days ago • 4
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning Paper • 2501.12570 • Published 8 days ago • 20
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 7 days ago • 260
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published 9 days ago • 84
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published 15 days ago • 55
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning Paper • 2501.06590 • Published 18 days ago • 8
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning Paper • 2501.06458 • Published 19 days ago • 29
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 16 days ago • 89
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 59
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 22 days ago • 84
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 21 days ago • 90
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 20 items • Updated 14 days ago • 106
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Paper • 2412.21199 • Published about 1 month ago • 13