-
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Paper • 2412.18319 • Published • 37 -
Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published • 45 -
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published • 35 -
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Paper • 2412.17256 • Published • 45
Collections
Discover the best community collections!
Collections including paper arxiv:2501.12948
-
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Paper • 2402.14848 • Published • 18 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 46 -
How Far Are We from Intelligent Visual Deductive Reasoning?
Paper • 2403.04732 • Published • 20 -
Learning to Reason and Memorize with Self-Notes
Paper • 2305.00833 • Published • 5
-
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 225 -
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 98 -
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 249