O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning • Paper • 2501.12570 • Published 6 days ago • 20
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback • Paper • 2501.12895 • Published 5 days ago • 48
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding • Paper • 2501.13106 • Published 5 days ago • 69
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 5 days ago • 216
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training • Paper • 2501.11425 • Published 7 days ago • 77
Towards Best Practices for Open Datasets for LLM Training • Paper • 2501.08365 • Published 13 days ago • 49
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models • Paper • 2501.09686 • Published 11 days ago • 35
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps • Paper • 2501.09732 • Published 11 days ago • 65
PaSa: An LLM Agent for Comprehensive Academic Paper Search • Paper • 2501.10120 • Published 10 days ago • 38
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents • Paper • 2501.08828 • Published 12 days ago • 28
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning • Paper • 2501.06458 • Published 17 days ago • 29
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs • Paper • 2501.06186 • Published 17 days ago • 59
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • 2501.08313 • Published 13 days ago • 268
Multi-task retriever fine-tuning for domain-specific and efficient RAG • Paper • 2501.04652 • Published 19 days ago • 10
Search-o1: Agentic Search-Enhanced Large Reasoning Models • Paper • 2501.05366 • Published 18 days ago • 80
Agent Laboratory: Using LLM Agents as Research Assistants • Paper • 2501.04227 • Published 20 days ago • 81