Diffusion Model Alignment Using Direct Preference Optimization
Abstract
Large language models (LLMs) are fine-tuned on human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to better align them with users' preferences. In contrast to LLMs, human preference learning has not been widely explored in text-to-image diffusion models; the best existing approach is to fine-tune a pretrained model on carefully curated, high-quality images and captions to improve visual appeal and text alignment. We propose Diffusion-DPO, a method to align diffusion models to human preferences by directly optimizing on human comparison data. Diffusion-DPO is adapted from the recently developed Direct Preference Optimization (DPO), a simpler alternative to RLHF which directly optimizes a policy that best satisfies human preferences under a classification objective. We re-formulate DPO to account for a diffusion model notion of likelihood, utilizing the evidence lower bound to derive a differentiable objective. Using the Pick-a-Pic dataset of 851K crowdsourced pairwise preferences, we fine-tune the base model of the state-of-the-art Stable Diffusion XL (SDXL)-1.0 model with Diffusion-DPO. In human evaluation, our fine-tuned base model significantly outperforms both base SDXL-1.0 and the larger SDXL-1.0 pipeline that adds a refinement model, improving visual appeal and prompt alignment. We also develop a variant that uses AI feedback and has comparable performance to training on human preferences, opening the door to scaling diffusion model alignment methods.
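To make the objective described in the abstract concrete, here is a minimal, hypothetical PyTorch sketch of a Diffusion-DPO-style loss. It is not the authors' implementation: the function name, the argument layout (per-sample denoising errors computed elsewhere), and the `beta` value are illustrative assumptions, and the per-timestep weighting from the full derivation is folded into the single `beta` constant.

```python
import torch
import torch.nn.functional as F

def diffusion_dpo_loss(model_err_w, model_err_l, ref_err_w, ref_err_l, beta=5000.0):
    """Sketch of a Diffusion-DPO-style objective (illustrative, not the paper's code).

    Each argument is a per-sample squared noise-prediction error
    ||eps - eps_hat(x_t, t)||^2, computed with the same timestep t and noise eps:
      model_err_w / model_err_l : trainable model on the preferred / dispreferred image
      ref_err_w   / ref_err_l   : frozen reference model on the same pair
    beta is a regularization strength; the value here is a placeholder.
    """
    # Margin: how much more the trainable model favors the preferred image over
    # the dispreferred one, relative to the frozen reference model.
    model_diff = model_err_w - model_err_l
    ref_diff = ref_err_w - ref_err_l
    # DPO-style logistic loss on that margin. The negative sign is because a
    # lower denoising error corresponds to a higher model likelihood (ELBO).
    return -F.logsigmoid(-beta * (model_diff - ref_diff)).mean()


# Toy usage with random "errors" standing in for real denoising losses.
if __name__ == "__main__":
    errs = [torch.rand(8) for _ in range(4)]
    print(diffusion_dpo_loss(*errs))
```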
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model (2023)
- Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression (2023)
- Aligning Text-to-Image Diffusion Models with Reward Backpropagation (2023)
- Enhancing Diffusion Models with Text-Encoder Reinforcement Learning (2023)
- SuperHF: Supervised Iterative Learning from Human Feedback (2023)
- BeautifulPrompt: Towards Automatic Prompt Engineering for Text-to-Image Synthesis (2023)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face, check out this Space.
Aligning AI Art: Diffusion-DPO Explained!
Links:
- Subscribe: https://www.youtube.com/@Arxflix
- Twitter: https://x.com/arxflix
- LMNT (Partner): https://lmnt.com/
Models citing this paper: 5
Datasets citing this paper: 0