Model Card

Model Details

Architecture: ViT-Large with patch size 14
Training Data: Stanford Cars dataset

Training Details

The model is fine-tuned with the Adam optimizer at a constant learning rate of 1e-5 for 4000 steps (batch_size=32). Only the vision encoder is fine-tuned.
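The setup above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' training script: the helper name `build_vision_only_optimizer` is hypothetical, and "vision encoder" is assumed to mean the `vision_model` submodule, as it is named in the `transformers` `CLIPModel` class. The card does not say whether the projection layer was also updated, so the sketch leaves it frozen.

```python
import torch
from torch import nn

# Hyperparameters as stated in this card.
LEARNING_RATE = 1e-5
NUM_STEPS = 4000
BATCH_SIZE = 32


def build_vision_only_optimizer(
    model: nn.Module, lr: float = LEARNING_RATE
) -> torch.optim.Adam:
    """Freeze every weight except `model.vision_model` and return Adam.

    Assumption: the vision encoder lives in a `vision_model` attribute
    (true for transformers' CLIPModel); the card does not name it.
    """
    # Freeze the whole model first (text encoder, projections, logit scale).
    for param in model.parameters():
        param.requires_grad = False
    # Re-enable gradients for the vision encoder only.
    for param in model.vision_model.parameters():
        param.requires_grad = True
    trainable = [p for p in model.parameters() if p.requires_grad]
    # No scheduler is attached, so the learning rate stays constant.
    return torch.optim.Adam(trainable, lr=lr)
```

With `transformers` installed, the helper could be called on `CLIPModel.from_pretrained("openai/clip-vit-large-patch14")` and the returned optimizer stepped `NUM_STEPS` times on batches of `BATCH_SIZE` images.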

Evaluation Results

pre-trained: 0.7770098447799683
fine-tuned: 0.92734694480896

Safetensors

Model size: 303M params
Tensor type: F32
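As a rough guide to what these figures imply for storage, the short calculation below estimates the size of the weights. It assumes the rounded parameter count shown on this card and 4 bytes per F32 value; the helper name `weight_size_gb` is illustrative, and file-format overhead is ignored.

```python
NUM_PARAMS = 303_000_000  # "303M params" from this card (rounded)
BYTES_PER_PARAM = 4       # F32 is a 32-bit float


def weight_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Return the approximate weight size in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


size_gb = weight_size_gb(NUM_PARAMS, BYTES_PER_PARAM)
print(f"{size_gb:.2f} GB")  # prints "1.21 GB"
```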

Model tree for tanganke/clip-vit-large-patch14_stanford-cars

Base model: openai/clip-vit-large-patch14
Fine-tunes of the base model: 51, including this model
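Because only the vision encoder differs from the base model, one plausible way to use this checkpoint is to copy its vision weights into the full base CLIP model. The sketch below is an assumption about usage, not a documented recipe: `swap_vision_encoder` is a hypothetical helper, and the commented usage assumes the repository loads as a `transformers` `CLIPVisionModel`.

```python
from torch import nn

BASE_MODEL_ID = "openai/clip-vit-large-patch14"
FINETUNED_MODEL_ID = "tanganke/clip-vit-large-patch14_stanford-cars"


def swap_vision_encoder(clip_model: nn.Module, finetuned_vision: nn.Module) -> nn.Module:
    """Copy fine-tuned vision weights into a full CLIP model, in place.

    Both arguments are expected to expose a `vision_model` submodule with
    matching parameter shapes (as CLIPModel and CLIPVisionModel do).
    """
    state = finetuned_vision.vision_model.state_dict()
    clip_model.vision_model.load_state_dict(state)
    return clip_model


# Usage (downloads weights, so it is left commented out):
# from transformers import CLIPModel, CLIPVisionModel
# clip = CLIPModel.from_pretrained(BASE_MODEL_ID)
# vision = CLIPVisionModel.from_pretrained(FINETUNED_MODEL_ID)
# clip = swap_vision_encoder(clip, vision)
```

Keeping the text encoder from the base model leaves the usual CLIP zero-shot classification path available, with class-name prompts encoded by the unchanged text tower.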

Collection including tanganke/clip-vit-large-patch14_stanford-cars

CLIP-ViT-L/14 on the eight image classification tasks (9 items). If you find these models helpful, consider citing [our paper](https://arxiv.org/abs/2406.03280).