-
Instruct-Imagen: Image Generation with Multi-modal Instruction
Paper • 2401.01952 • Published • 30 -
ODIN: A Single Model for 2D and 3D Perception
Paper • 2401.02416 • Published • 11 -
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Paper • 2404.01367 • Published • 20 -
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models
Paper • 2404.02747 • Published • 11
Collections
Discover the best community collections!
Collections including paper arxiv:2406.09413
-
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Paper • 2402.14848 • Published • 18 -
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper • 2406.06608 • Published • 53 -
CRAG -- Comprehensive RAG Benchmark
Paper • 2406.04744 • Published • 41 -
Transformers meet Neural Algorithmic Reasoners
Paper • 2406.09308 • Published • 43
-
pOps: Photo-Inspired Diffusion Operators
Paper • 2406.01300 • Published • 16 -
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Paper • 2406.06911 • Published • 10 -
Interpreting the Weight Space of Customized Diffusion Models
Paper • 2406.09413 • Published • 18 -
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
Paper • 2406.09162 • Published • 13
-
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Paper • 2404.15653 • Published • 26 -
MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published • 12 -
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 45 -
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 28
-
EdgeFusion: On-Device Text-to-Image Generation
Paper • 2404.11925 • Published • 21 -
Dynamic Typography: Bringing Words to Life
Paper • 2404.11614 • Published • 43 -
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
Paper • 2404.07987 • Published • 47 -
Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
Paper • 2404.07724 • Published • 12
-
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
Paper • 2312.09608 • Published • 13 -
CodeFusion: A Pre-trained Diffusion Model for Code Generation
Paper • 2310.17680 • Published • 69 -
ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
Paper • 2310.17994 • Published • 8 -
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss
Paper • 2401.02677 • Published • 22