romrawinjp/clip-kd_ViT-T-16-Laion400M_KD-CC3M12M

Zero-Shot Image Classification


Model card for CLIP ViT-T-16 distilled on CC3M and CC12M from a CLIP ViT-B-16 teacher pretrained on LAION-400M

Source checkpoint: ViT-B-16_cc3m_12m_kd_ViT-T-16_cc3m_12m_ep32.pt

Model Details

Model Description

A CLIP ViT-T-16 student model distilled on CC3M and CC12M from a CLIP ViT-B-16 teacher pretrained on LAION-400M.
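
As a rough illustration of how a CLIP-style model is typically used for zero-shot image classification, the sketch below scores one image embedding against a set of text-prompt embeddings. It is not taken from the original work: the function names, the toy 3-dimensional embeddings, and the logit scale of 100 are all assumptions made for illustration. In practice the embeddings would come from this model's image and text encoders.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Return softmax probabilities over text prompts for one image embedding.

    logit_scale=100.0 is an assumed, commonly used CLIP temperature,
    not a value read from this checkpoint.
    """
    img = l2_normalize(image_emb)
    logits = [
        logit_scale * sum(a * b for a, b in zip(img, l2_normalize(txt)))
        for txt in text_embs
    ]
    # Subtract the max logit before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical prompts and toy embeddings standing in for real encoder outputs.
labels = ["a photo of a cat", "a photo of a dog"]
toy_image_emb = [0.9, 0.1, 0.0]
toy_text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_probs(toy_image_emb, toy_text_embs)
best = labels[probs.index(max(probs))]
print(best, probs)
```

The predicted class is simply the prompt whose embedding has the highest cosine similarity with the image embedding; the softmax only turns those similarities into comparable scores.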

Reference

For details, please refer to the original work:

@inproceedings{yang2024clip,
  title={CLIP-KD: An Empirical Study of CLIP Model Distillation},
  author={Yang, Chuanguang and An, Zhulin and Huang, Libo and Bi, Junyu and Yu, Xinqiang and Yang, Han and Diao, Boyu and Xu, Yongjun},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}

Safetensors: model size 46.1M params, tensor type F32
