Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality 6 days ago • 57
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 17 days ago • 128
SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question Answering? Paper • 2502.13233 • Published 19 days ago • 13
Baichuan-M1: Pushing the Medical Capability of Large Language Models Paper • 2502.12671 • Published 19 days ago • 1
Scaling Test-Time Compute Without Verification or RL is Suboptimal Paper • 2502.12118 • Published 20 days ago • 1
Soundwave: Less is More for Speech-Text Alignment in LLMs Paper • 2502.12900 • Published 19 days ago • 76
Is Noise Conditioning Necessary for Denoising Generative Models? Paper • 2502.13129 • Published 19 days ago • 1
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario Paper • 2501.10132 • Published Jan 17 • 19
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 84
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published Jan 29 • 55
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 199
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 27 days ago • 142
Scaling Pre-training to One Hundred Billion Data for Vision Language Models Paper • 2502.07617 • Published 26 days ago • 29