GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning Paper • 2504.00891 • Published 18 days ago • 12 • 3
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published 18 days ago • 242
Inference-Time Scaling for Generalist Reward Modeling Paper • 2504.02495 • Published 16 days ago • 52 • 4
Improved Visual-Spatial Reasoning via R1-Zero-Like Training Paper • 2504.00883 • Published 18 days ago • 60
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond Paper • 2503.21614 • Published 23 days ago • 39 • 4
Effectively Controlling Reasoning Models through Thinking Intervention Paper • 2503.24370 • Published 18 days ago • 18 • 4
OThink-MR1: Stimulating multimodal generalized reasoning capabilities via dynamic reinforcement learning Paper • 2503.16081 • Published 30 days ago • 26 • 3
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation Paper • 2503.22675 • Published 21 days ago • 34
Efficient Inference for Large Reasoning Models: A Survey Paper • 2503.23077 • Published 21 days ago • 45 • 3
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published 19 days ago • 61 • 3
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation Paper • 2503.19693 • Published 25 days ago • 75
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback Paper • 2503.22230 • Published 22 days ago • 43