Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 4 days ago • 65
Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks Paper • 2503.04378 • Published 4 days ago • 6
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion Paper • 2503.04222 • Published 4 days ago • 12
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation Paper • 2503.02972 • Published 5 days ago • 23
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? Paper • 2502.19361 • Published 11 days ago • 26
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems Paper • 2502.19328 • Published 11 days ago • 21
Rank1: Test-Time Compute for Reranking in Information Retrieval Paper • 2502.18418 • Published 12 days ago • 25
CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale Paper • 2502.16645 • Published 14 days ago • 21
FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving Paper • 2502.20238 • Published 10 days ago • 24
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts Paper • 2502.20395 • Published 10 days ago • 43
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers Paper • 2502.20545 • Published 10 days ago • 20
Multi-Turn Code Generation Through Single-Step Rewards Paper • 2502.20380 • Published 10 days ago • 29
Predictive Data Selection: The Data That Predicts Is the Data That Teaches Paper • 2503.00808 • Published 8 days ago • 51
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding Paper • 2503.02951 • Published 5 days ago • 25
IterPref: Focal Preference Learning for Code Generation via Iterative Debugging Paper • 2503.02783 • Published 5 days ago • 5