SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning Paper • 2504.07891 • Published 9 days ago • 4
AI-University: An LLM-based platform for instructional alignment to scientific classrooms Paper • 2504.08846 • Published 8 days ago • 6
VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge Paper • 2504.10342 • Published 5 days ago • 9
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce Paper • 2504.11343 • Published 4 days ago • 9
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning Paper • 2504.11409 • Published 3 days ago • 9
ReZero: Enhancing LLM search ability by trying one-more-time Paper • 2504.11001 • Published 4 days ago • 9
RealHarm: A Collection of Real-World Language Model Application Failures Paper • 2504.10277 • Published 5 days ago • 10
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning Paper • 2504.11456 • Published 3 days ago • 10
Efficient Process Reward Model Training via Active Learning Paper • 2504.10559 • Published 5 days ago • 11
DataDecide: How to Predict Best Pretraining Data with Small Experiments Paper • 2504.11393 • Published 4 days ago • 13
Heimdall: test-time scaling on the generative verification Paper • 2504.10337 • Published 5 days ago • 29
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients Paper • 2504.10766 • Published 4 days ago • 36
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning Paper • 2504.08672 • Published 8 days ago • 50
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published 4 days ago • 77
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models Paper • 2504.11468 • Published 9 days ago • 19
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference Paper • 2504.10326 • Published 5 days ago • 22
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published 3 days ago • 43