-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 28 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 13 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
Collections
Discover the best community collections!
Collections including paper arxiv:2504.02495
-
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Paper • 2310.16818 • Published • 32 -
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 47 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 55 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 62
-
Inference-Time Scaling for Generalist Reward Modeling
Paper • 2504.02495 • Published • 52 -
Large Language Diffusion Models
Paper • 2502.09992 • Published • 112 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 224 -
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
Paper • 2501.18511 • Published • 19
-
Natural Language Reinforcement Learning
Paper • 2411.14251 • Published • 31 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 29 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 46 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 50
-
RL + Transformer = A General-Purpose Problem Solver
Paper • 2501.14176 • Published • 28 -
Towards General-Purpose Model-Free Reinforcement Learning
Paper • 2501.16142 • Published • 29 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 120 -
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
Paper • 2412.12098 • Published • 5
-
Competitive Programming with Large Reasoning Models
Paper • 2502.06807 • Published • 70 -
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Paper • 2502.06703 • Published • 150 -
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
Paper • 2502.06781 • Published • 61 -
LIMO: Less is More for Reasoning
Paper • 2502.03387 • Published • 61