[Paper] [GitHub]

A robust perceptual metric based on the CLIP model laion/CLIP-ViT-B-16-laion2B-s34B-b88K.

The model was adversarially fine-tuned with FARE (Schlarmann et al., 2024) on ImageNet, using infinity-norm (L-inf) perturbations of radius 4/255.
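As a reminder of what this fine-tuning optimizes, the FARE objective described in Schlarmann et al. (2024) can be sketched as follows. The symbols are chosen here for illustration and may differ from the paper's notation: \phi is the vision encoder being fine-tuned, \phi_{Org} is the frozen original CLIP vision encoder, x is a clean image, and z ranges over perturbed images within the L-inf ball of radius \varepsilon around x.

```latex
\[
L_{\mathrm{FARE}}(\phi, x)
  = \max_{\|z - x\|_\infty \le \varepsilon}
    \left\| \phi(z) - \phi_{\mathrm{Org}}(x) \right\|_2^2,
\qquad \varepsilon = \tfrac{4}{255}
\]
```

Fine-tuning minimizes this loss averaged over the training images, so the embedding of a perturbed image stays close to the original embedding of the clean image. No labels or text inputs are needed.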

Performance on the perceptual similarity task NIGHTS, on clean inputs and under L-inf and L2 adversarial perturbations:

Clean     L-inf, eps=4/255     L2, eps=3
90.6      71.5                 65.5
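NIGHTS is a two-alternative forced choice (2AFC) task: given a reference image and two candidates, the metric should pick the candidate that human annotators judged more similar to the reference. The sketch below shows how such a score can be computed from precomputed embeddings. It assumes cosine similarity between image embeddings as the similarity measure; the helper names and toy vectors are hypothetical and are not the evaluation code behind the numbers above.

```python
import math
from typing import List, Sequence, Tuple

Embedding = Sequence[float]
# (reference, candidate 0, candidate 1, index of the candidate humans preferred)
Triplet = Tuple[Embedding, Embedding, Embedding, int]


def cosine_similarity(u: Embedding, v: Embedding) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def two_afc_accuracy(triplets: List[Triplet]) -> float:
    """Percentage of triplets on which the metric agrees with the human choice."""
    correct = 0
    for ref, cand0, cand1, human_choice in triplets:
        # The metric picks the candidate whose embedding is closer to the reference.
        sim0 = cosine_similarity(ref, cand0)
        sim1 = cosine_similarity(ref, cand1)
        predicted = 0 if sim0 >= sim1 else 1
        correct += int(predicted == human_choice)
    return 100.0 * correct / len(triplets)


# Toy 2-D "embeddings" standing in for real image features.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0], 0),   # metric picks 0: agrees
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9], 1),   # metric picks 1: agrees
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.0], 1),  # metric picks 0: disagrees
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.2], 1),   # metric picks 1: agrees
]
print(two_afc_accuracy(toy_triplets))  # 75.0
```

For the robust columns, the same score would be computed after an adversary perturbs the images within the stated L-inf or L2 budget.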

Usage

import open_clip

model, _, image_processor = open_clip.create_model_and_transforms('hf-hub:chs20/FARE4-ViT-B-16-laion2B-s34B-b88K')
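Once loaded, the model can be used as a perceptual metric by embedding two images and comparing the embeddings. The following is a minimal sketch, assuming cosine similarity of L2-normalized image embeddings as the similarity score. The function names are hypothetical, and `torch`, `Pillow`, and `open_clip` are assumed to be installed; they are imported inside the functions that need them so that the plain-Python helper stays usable on its own.

```python
from typing import List, Sequence

MODEL_ID = "hf-hub:chs20/FARE4-ViT-B-16-laion2B-s34B-b88K"


def load_model():
    """Load the model and its image preprocessing transform via open_clip."""
    import open_clip  # imported lazily: only needed when the model is loaded

    model, _, image_processor = open_clip.create_model_and_transforms(MODEL_ID)
    model.eval()  # inference mode
    return model, image_processor


def embed_images(model, image_processor, paths: Sequence[str]) -> List[List[float]]:
    """Return one L2-normalized embedding per image file in `paths`."""
    import torch
    from PIL import Image

    # Preprocess each image and stack into a single batch tensor.
    batch = torch.stack(
        [image_processor(Image.open(p).convert("RGB")) for p in paths]
    )
    with torch.no_grad():
        features = model.encode_image(batch)
    # Normalize so that a dot product equals the cosine similarity.
    features = features / features.norm(dim=-1, keepdim=True)
    return features.tolist()


def similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two unit-norm embeddings (a plain dot product)."""
    return sum(a * b for a, b in zip(u, v))


def compare_images(path_a: str, path_b: str) -> float:
    """Similarity between two image files; higher means more alike."""
    model, image_processor = load_model()
    emb_a, emb_b = embed_images(model, image_processor, [path_a, path_b])
    return similarity(emb_a, emb_b)
```

Calling `compare_images("first.png", "second.png")` with two local image paths (placeholders here) returns a score in [-1, 1]. For repeated comparisons, load the model once with `load_model()` and reuse it rather than calling `compare_images` each time.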

Citation

If you find this model useful, please consider citing our papers:

@inproceedings{croce2024adversarially,
  title={Adversarially Robust CLIP Models Induce Better (Robust) Perceptual Metrics},
  author={Croce, Francesco and Schlarmann, Christian and Singh, Naman Deep and Hein, Matthias},
  year={2024},
  booktitle={{ICML Workshop on Foundation Models in the Wild}}
}
@inproceedings{schlarmann2024robustclip,
  title={Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models},
  author={Schlarmann, Christian and Singh, Naman Deep and Croce, Francesco and Hein, Matthias},
  year={2024},
  booktitle={{ICML}}
}