SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution Paper • 2501.05040 • Published 21 days ago • 15
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 20 days ago • 87
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published 6 days ago • 40
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding Paper • 2501.13200 • Published 7 days ago • 60
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning Paper • 2501.12570 • Published 8 days ago • 20
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 7 days ago • 261
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published 9 days ago • 84
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Paper • 2501.12380 • Published 8 days ago • 79
Unsloth 4-bit Dynamic Quants Collection Unsloths Dynamic 4bit Quants selectively skips quantizing certain parameters; greatly improving accuracy while only using <10% more VRAM than BnB 4bit • 13 items • Updated 5 days ago • 25
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 27 items • Updated 3 days ago • 101
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 20 days ago • 83