OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows Paper • 2412.01169 • Published 24 days ago • 11
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following Paper • 2312.06738 • Published Dec 11, 2023
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data Paper • 2402.05892 • Published Feb 8
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning Paper • 2212.14532 • Published Dec 30, 2022 • 1
Scaling Properties of Diffusion Models for Perceptual Tasks Paper • 2411.08034 • Published Nov 12 • 13
Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence Paper • 2305.14334 • Published May 23, 2023 • 1
Readout Guidance: Learning Control from Diffusion Features Paper • 2312.02150 • Published Dec 4, 2023 • 3
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Paper • 2410.08261 • Published Oct 10 • 49
An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control Paper • 2403.04880 • Published Mar 7 • 7
Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner Paper • 2409.12963 • Published Sep 19
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences Paper • 2408.14468 • Published Aug 26 • 35
Deep Multimodal Fusion for Surgical Feedback Classification Paper • 2312.03231 • Published Dec 6, 2023
Pose-Aware Self-Supervised Learning with Viewpoint Trajectory Regularization Paper • 2403.14973 • Published Mar 22