Model Card for Emiel/cub-200-bird-classifier-swin
Model Description
This model was finetuned for the "Feather in Focus!" Kaggle competition of the Information Studies Master's Applied Machine Learning course at the University of Amsterdam. The goal of the competition was to apply novel approaches to achieve the highest possible accuracy on a bird classification task with 200 classes. We were given a labeled dataset of 3926 images and an unlabeled dataset of 4000 test images. Out of 32 groups and 1083 submissions, we achieved the #1 accuracy on the test set with a score of 0.87950.
Training Details
The base model we finetuned, microsoft/swin-large-patch4-window12-384-in22k, was pre-trained on ImageNet-21k; see https://huggingface.co/microsoft/swin-large-patch4-window12-384-in22k.
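A minimal sketch of how such a checkpoint could be loaded for a 200-class task with the `transformers` library. The function name `build_model` and the constants are illustrative assumptions; the card does not show the loading code that was actually used.

```python
CHECKPOINT = "microsoft/swin-large-patch4-window12-384-in22k"
NUM_CLASSES = 200  # bird classes in the competition


def build_model(checkpoint: str = CHECKPOINT, num_classes: int = NUM_CLASSES):
    """Load the pre-trained Swin backbone with a fresh classification head."""
    # Imported lazily so that importing this module does not require
    # transformers; calling the function downloads the weights.
    from transformers import AutoModelForImageClassification

    # The checkpoint ships with an ImageNet-21k head, so the head shape does
    # not match num_classes. ignore_mismatched_sizes=True lets transformers
    # drop that head and initialise a new num_classes-way one.
    return AutoModelForImageClassification.from_pretrained(
        checkpoint,
        num_labels=num_classes,
        ignore_mismatched_sizes=True,
    )
```

Calling `build_model()` returns a model whose outputs expose `.logits` with one score per class.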
Preprocessing
Data augmentation was applied to the training data in a custom Torch dataset class. Because the dataset is small, the original images were kept and an augmented copy of each was added, rather than replacing the originals with augmented versions. The only augmentations applied were horizontal flips and rotations (10 degrees), chosen to match the relatively homogeneous dataset.
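The duplicate-and-augment idea can be sketched as a map-style dataset that serves every original sample once and one augmented copy of each. The class name `DuplicateAugmentDataset` is hypothetical and this is not the authors' dataset class. In a torchvision pipeline, `augment` could be something like `transforms.Compose([transforms.RandomHorizontalFlip(), transforms.RandomRotation(10)])`, which is also an assumption.

```python
from typing import Any, Callable, Sequence, Tuple


class DuplicateAugmentDataset:
    """Map-style dataset: originals first, then one augmented copy of each."""

    def __init__(
        self,
        samples: Sequence[Tuple[Any, int]],
        augment: Callable[[Any], Any],
    ) -> None:
        self.samples = samples
        self.augment = augment

    def __len__(self) -> int:
        # Every sample appears twice: once unchanged, once augmented.
        return 2 * len(self.samples)

    def __getitem__(self, index: int) -> Tuple[Any, int]:
        n = len(self.samples)
        if not 0 <= index < 2 * n:
            raise IndexError(index)
        image, label = self.samples[index % n]
        if index >= n:  # second half holds the augmented copies
            image = self.augment(image)
        return image, label


# Toy usage: strings stand in for images, str.upper stands in for the augmentation.
toy = DuplicateAugmentDataset([("a", 0), ("b", 1)], str.upper)
```

Because the class only implements `__len__` and `__getitem__`, it can be passed to a `torch.utils.data.DataLoader` like any other map-style dataset.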
Finetuning
We finetuned some 50 different models, including various vision transformers (ViTs) and CNNs. All models were trained for 10 epochs; after every epoch, the model was saved if it had the best evaluation accuracy so far.
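Keeping only the best checkpoint can be sketched as follows. The helper `track_best` and its `save` callback are hypothetical names; in a real run the callback would typically write the model weights to disk, for example with `torch.save(model.state_dict(), path)`.

```python
from typing import Callable, Iterable, Optional, Tuple


def track_best(
    epoch_accuracies: Iterable[float],
    save: Callable[[int, float], None],
) -> Tuple[Optional[int], float]:
    """Call save(epoch, accuracy) whenever the evaluation accuracy improves."""
    best_epoch, best_acc = None, float("-inf")
    for epoch, acc in enumerate(epoch_accuracies, start=1):
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
            save(epoch, acc)  # overwrite the previously saved checkpoint
    return best_epoch, best_acc


# Toy usage with made-up accuracies; saved_epochs records when a save happened.
saved_epochs = []
best = track_best([0.5, 0.8, 0.7], lambda epoch, acc: saved_epochs.append(epoch))
```

In this toy run the checkpoint is written after epochs 1 and 2, and the epoch 3 result is discarded because it is worse than the best so far.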
Finetuning Data
The finetuning data is a subset of the cub-200-2011 dataset, http://www.vision.caltech.edu/datasets/cub_200_2011/. We finetuned the model on 3533 samples of the labeled dataset we were given, stratified on the label (7066 including augmented images).
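A label-stratified split can be sketched with the standard library alone. The 10% evaluation fraction is an assumption inferred from the reported counts (3533 and 393 out of 3926), and the function name `stratified_split` is hypothetical. Because this sketch rounds per class, it will not necessarily reproduce those exact counts.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int],
    eval_fraction: float = 0.1,  # assumed, not stated in the card
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, eval_indices) with every class in both parts."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train: List[int] = []
    evaluation: List[int] = []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # At least one evaluation sample per class; classes with a single
        # sample would therefore end up in the evaluation part only.
        n_eval = max(1, round(len(indices) * eval_fraction))
        evaluation.extend(indices[:n_eval])
        train.extend(indices[n_eval:])
    return train, evaluation


# Toy usage: 200 classes with 20 samples each.
toy_labels = [c for c in range(200) for _ in range(20)]
train_idx, eval_idx = stratified_split(toy_labels)
```

Doubling the training part with one augmented copy per image then gives the "including augmented images" figure, for example 2 × 3533 = 7066.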
Finetuning Hyperparameters
Hyperparameter | Value |
---|---|
Optimizer | AdamW |
Learning Rate | 1e-4 |
Batch Size | 32 |
Epochs | 2 |
Weight Decay | * |
Class Weight | * |
Label Smoothing | * |
Scheduler | * |
Mixed Precision | Torch AMP |
*These parameters were intentionally left unset because setting them gave poor results.
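The table above could translate into a PyTorch training loop roughly like the sketch below. It is not the authors' training script: the function `train_one_epoch`, the tiny linear stand-in model, and the random toy batch are assumptions made so the example runs on its own. Weight decay, class weights, label smoothing, and a scheduler are left at library defaults or omitted, matching the starred rows.

```python
import torch
from torch import nn


def train_one_epoch(model, loader, optimizer, scaler, device):
    """Run one epoch with automatic mixed precision; return the mean loss."""
    criterion = nn.CrossEntropyLoss()  # no class weights, no label smoothing
    use_amp = device.type == "cuda"  # AMP is only enabled on a GPU here
    model.train()
    total_loss, n_batches = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad(set_to_none=True)
        with torch.autocast(device_type=device.type, enabled=use_amp):
            # With a transformers model this would be
            # model(pixel_values=images).logits instead.
            logits = model(images)
            loss = criterion(logits, labels)
        # The scaler is a pass-through when AMP is disabled.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
        total_loss += loss.item()
        n_batches += 1
    return total_loss / max(n_batches, 1)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(8, 200).to(device)  # stand-in for the Swin model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # default weight decay
scaler = torch.cuda.amp.GradScaler(enabled=device.type == "cuda")

# One toy batch of 32 samples (the batch size from the table) with random data.
toy_loader = [(torch.randn(32, 8), torch.randint(0, 200, (32,)))]
mean_loss = train_one_epoch(model, toy_loader, optimizer, scaler, device)
```

Newer PyTorch releases also offer `torch.amp.GradScaler`; the `torch.cuda.amp` spelling is used here because it is available across a wider range of versions.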
Evaluation Data
The evaluation data is a subset of the cub-200-2011 dataset, http://www.vision.caltech.edu/datasets/cub_200_2011/. We evaluated the model on 393 samples of the labeled dataset we were given, stratified on the label.
Testing Data
The testing data is an unlabeled subset of the cub-200-2011 dataset, http://www.vision.caltech.edu/datasets/cub_200_2011/, consisting of 4000 images. After finetuning, the best model according to the evaluation data was loaded and used to predict the labels of the unlabeled test set. These predicted labels were submitted to the Kaggle competition as a CSV file, and Kaggle returned the test accuracy.
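Writing predictions to a submission file might look like the sketch below. The column names `id` and `label` and the helper `write_submission` are hypothetical, since the card does not state the CSV layout the competition required.

```python
import csv
import io
from typing import Iterable, TextIO, Tuple


def write_submission(rows: Iterable[Tuple[int, int]], out: TextIO) -> None:
    """Write (image_id, predicted_label) pairs as a two-column CSV."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "label"])  # hypothetical header names
    writer.writerows(rows)


# Toy usage: write two made-up predictions to an in-memory buffer.
buffer = io.StringIO()
write_submission([(1, 17), (2, 42)], buffer)
```

For a real submission, `out` would be a file opened with `open(path, "w", newline="")` and the rows would come from the model's predicted class for each test image.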
Poster
*Novel approaches were not applied when finetuning the final model, as they did not improve accuracy.
Model tree for Emiel/cub-200-bird-classifier-swin
Evaluation results
- validation_accuracy on cub-200-2011 (self-reported): 0.865
- test_accuracy on cub-200-2011 (self-reported): 0.879