FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance • Paper • 2408.08189 • Published Aug 15, 2024 • 17
An Empirical Study of Autoregressive Pre-training from Videos • Paper • 2501.05453 • Published 4 days ago • 30
MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting • Paper • 2501.03714 • Published 6 days ago • 8
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control • Paper • 2501.03847 • Published 6 days ago • 18
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration • Paper • 2501.01320 • Published 11 days ago • 10
PERSE: Personalized 3D Generative Avatars from A Single Portrait • Paper • 2412.21206 • Published 14 days ago • 15
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey • Paper • 2412.18619 • Published 28 days ago • 52
DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation • Paper • 2412.15200 • Published 25 days ago • 9
Autoregressive Video Generation without Vector Quantization • Paper • 2412.14169 • Published 26 days ago • 14
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints • Paper • 2412.07760 • Published Dec 10, 2024 • 50
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes • Paper • 2412.11457 • Published 28 days ago • 5
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion • Paper • 2412.09626 • Published Dec 12, 2024 • 20
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction • Paper • 2412.09573 • Published Dec 12, 2024 • 7
POINTS1.5: Building a Vision-Language Model towards Real World Applications • Paper • 2412.08443 • Published Dec 11, 2024 • 38
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer • Paper • 2412.07720 • Published Dec 10, 2024 • 30
MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance • Paper • 2412.05355 • Published Dec 6, 2024 • 7
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling • Paper • 2412.05271 • Published Dec 6, 2024 • 125
Improved Distribution Matching Distillation for Fast Image Synthesis • Paper • 2405.14867 • Published May 23, 2024 • 12
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment • Paper • 2412.04814 • Published Dec 6, 2024 • 45