DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 4 days ago • 187
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published 6 days ago • 74
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement Paper • 2501.12273 • Published 5 days ago • 14
Article Yay! Organizations can now publish blog Articles By huggingface • 6 days ago • 29
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 27 items • Updated 1 day ago • 62
Jan 17 Releases Collection Models and datasets of the second week of Jan 2025. • 23 items • Updated 9 days ago • 10
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 10 days ago • 65
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published 10 days ago • 46
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published 11 days ago • 28
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published 11 days ago • 15
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 12 days ago • 268
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 13 days ago • 86
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains Paper • 2501.05707 • Published 16 days ago • 19
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 16 days ago • 59
VideoRAG: Retrieval-Augmented Generation over Video Corpus Paper • 2501.05874 • Published 16 days ago • 66
Enhancing Human-Like Responses in Large Language Models Paper • 2501.05032 • Published 17 days ago • 49