
ViPer: Visual Personalization of Generative Models via Individual Preference Learning

Tuning-free framework for personalized image generation

Website | Paper | GitHub | BibTeX

We introduce ViPer, a method that personalizes the output of generative models to align with different users' visual preferences for the same prompt. This is done by capturing a user's general visual preferences once and conditioning the generative model on them, with no need to engineer detailed prompts.

Installation

For installation instructions, please see https://github.com/EPFL-VILAB/ViPer.

Usage

This model can be loaded from the Hugging Face Hub as follows:

from transformers import AutoModelForVision2Seq
from peft import PeftModel

# Load the Idefics2-8B base model
model = AutoModelForVision2Seq.from_pretrained("HuggingFaceM4/idefics2-8b")

# Apply the ViPer PEFT adapter on top of the base model
model = PeftModel.from_pretrained(model, "EPFL-VILAB/Metric-ViPer")
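
If GPU memory is tight, the base model can also be loaded in 4-bit precision with BitsAndBytesConfig before applying the adapter. This is the standard transformers + bitsandbytes pattern rather than anything ViPer-specific, and the quantization settings below are illustrative:

import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig
from peft import PeftModel

# Illustrative 4-bit NF4 quantization config to reduce GPU memory usage
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the quantized base model, then apply the ViPer adapter as above
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b",
    quantization_config=quantization_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "EPFL-VILAB/Metric-ViPer")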

Please see https://github.com/EPFL-VILAB/ViPer for more detailed instructions.
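
As a minimal end-to-end sketch, the loaded model can be queried with the standard Idefics2 chat template. The image URL and the prompt text below are placeholders, not the actual input format Metric-ViPer expects for preference prediction; see the repository for that:

import requests
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")

# Placeholder image; replace with your own input
url = "https://example.com/image.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Standard Idefics2 chat-template format: an image plus a text instruction
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},  # placeholder prompt
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])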

Citation

If you find this repository helpful, please consider citing our work:

@article{ViPer,
  title={{ViPer}: Visual Personalization of Generative Models via Individual Preference Learning},
  author={Sogand Salehi and Mahdi Shafiei and Teresa Yeo and Roman Bachmann and Amir Zamir},
  journal={arXiv preprint arXiv:2407.17365},
  year={2024},
}

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.
