PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models Paper • 2412.18608 • Published 1 day ago • 6
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation Paper • 2412.18597 • Published 1 day ago • 12
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding Paper • 2412.18450 • Published 1 day ago • 26
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage Paper • 2412.15484 • Published 6 days ago • 13
Offline Reinforcement Learning for LLM Multi-Step Reasoning Paper • 2412.16145 • Published 5 days ago • 32
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World Paper • 2412.17589 • Published 3 days ago • 8
Revisiting In-Context Learning with Long Context Language Models Paper • 2412.16926 • Published 4 days ago • 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper • 2412.17498 • Published 3 days ago • 15
Large Motion Video Autoencoding with Cross-modal Video VAE Paper • 2412.17805 • Published 2 days ago • 20
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 3 days ago • 32
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Paper • 2412.17256 • Published 3 days ago • 34
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling Paper • 2412.15084 • Published 6 days ago • 12
Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion Paper • 2412.14462 • Published 7 days ago • 15
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations Paper • 2412.13171 • Published 8 days ago • 30
ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers Paper • 2412.12571 • Published 9 days ago • 8