Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 3 days ago • 61
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion Paper • 2503.04222 • Published 3 days ago • 12
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs Paper • 2503.02003 • Published 6 days ago • 37
view article Article Hugging Face and JFrog partner to make AI Security more transparent 6 days ago • 18
C4AI Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 5 days ago • 59
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published 9 days ago • 32
CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale Paper • 2502.16645 • Published 14 days ago • 21
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts Paper • 2502.20395 • Published 10 days ago • 43
Phi-4 Collection Phi-4 family of small language and multi-modal models. • 7 items • Updated 6 days ago • 108
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator Paper • 2502.19204 • Published 11 days ago • 11
GHOST 2.0: generative high-fidelity one shot transfer of heads Paper • 2502.18417 • Published 12 days ago • 62
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? Paper • 2502.19361 • Published 11 days ago • 26
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding Paper • 2502.19400 • Published 11 days ago • 42