Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published 7 days ago • 55
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles Paper • 2503.03651 • Published 4 days ago • 15
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published 6 days ago • 37
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published 18 days ago • 38
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence Paper • 2502.13943 • Published 18 days ago • 7
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published 21 days ago • 52
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published 26 days ago • 30
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation Paper • 2502.12148 • Published 20 days ago • 16
ReLearn: Unlearning via Learning for Large Language Models Paper • 2502.11190 • Published 21 days ago • 29
Learning Getting-Up Policies for Real-World Humanoid Robots Paper • 2502.12152 • Published 20 days ago • 37
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 23 days ago • 51
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion Paper • 2502.08590 • Published 25 days ago • 40
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation Paper • 2502.08639 • Published 25 days ago • 37
WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation Paper • 2502.08047 • Published 26 days ago • 26