InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework Paper • 2504.12395 • Published 2 days ago • 11
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published 2 days ago • 24
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL Paper • 2504.11455 • Published 3 days ago • 10
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published 8 days ago • 37
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published 8 days ago • 117
FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation Paper • 2504.07405 • Published 9 days ago • 9
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought Paper • 2504.05599 • Published 11 days ago • 79
One-Minute Video Generation with Test-Time Training Paper • 2504.05298 • Published 11 days ago • 94
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published 16 days ago • 33
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published Apr 25, 2024 • 20
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models Paper • 2403.13535 • Published Mar 20, 2024 • 24
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30, 2024 • 76
DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability Paper • 2503.06505 • Published Mar 9 • 1
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement Paper • 2504.01934 • Published 16 days ago • 22
VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning Paper • 2504.02949 • Published 15 days ago • 19
Inference-Time Scaling for Generalist Reward Modeling Paper • 2504.02495 • Published 16 days ago • 52
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published 15 days ago • 55
SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published 16 days ago • 35