Model Card for BioCLIP
BioCLIP is a foundation model for the tree of life, built using the CLIP architecture as a vision model for general organismal biology. This variant is trained only on iNat21, unlike the original BioCLIP, which is trained on TreeOfLife-10M. More information can be found in BioCLIP.
How to Get Started with the Model
BioCLIP can be used with the open_clip library:
import open_clip
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:imageomics/bioclip-vit-b-16-inat-only')
tokenizer = open_clip.get_tokenizer('hf-hub:imageomics/bioclip-vit-b-16-inat-only')
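Once image and text features are available, CLIP-style zero-shot classification usually reduces to a cosine similarity followed by a softmax. The following is a minimal sketch of that scoring step in plain Python, not the library's own implementation. The function names, the toy feature vectors, and the logit scale of 100 are assumptions made for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def zero_shot_probs(image_feat, text_feats, logit_scale=100.0):
    """Softmax over scaled cosine similarities between one image and each label.

    logit_scale=100.0 is an assumed, commonly used CLIP value; a loaded model
    carries its own learned scale.
    """
    img = l2_normalize(image_feat)
    logits = [
        logit_scale * sum(a * b for a, b in zip(img, l2_normalize(t)))
        for t in text_feats
    ]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 3-dimensional features (hypothetical; real features are much larger).
image_feat = [0.9, 0.1, 0.0]
text_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs = zero_shot_probs(image_feat, text_feats)
```

With the objects loaded above, the features would typically come from `model.encode_image(...)` applied to a batch of images transformed with `preprocess_val`, and from `model.encode_text(tokenizer(labels))`.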
Training Details
Compute Infrastructure
Training was performed on 4 NVIDIA A100-80GB GPUs distributed over 1 node on OSC's Ascend HPC Cluster with global batch size 16,384 for 2 days.
Based on the Machine Learning Impact calculator presented in Lacoste et al. (2019), this corresponds to 33.16 kg of CO2 eq., or 134 km driven by an average ICE car.
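As a quick sanity check on the figures above, the arithmetic below derives each GPU's share of the global batch, the total GPU-hours, and the ratios implied by the reported emissions. It assumes that "2 days" means exactly 48 hours, which the card does not state, so the derived per-hour figure is only approximate.

```python
# Figures reported in this section.
num_gpus = 4
global_batch_size = 16_384
reported_kg_co2eq = 33.16
reported_km = 134

# Assumption: "2 days" is treated as exactly 48 hours of wall-clock time.
training_hours = 2 * 24

per_gpu_batch = global_batch_size // num_gpus        # each GPU's share of a global batch
gpu_hours = num_gpus * training_hours                # total GPU-hours under the assumption
kg_per_gpu_hour = reported_kg_co2eq / gpu_hours      # implied emissions per GPU-hour
kg_per_km = reported_kg_co2eq / reported_km          # implied car emissions per km

print(f"per-GPU batch: {per_gpu_batch}")
print(f"GPU-hours: {gpu_hours}")
print(f"kg CO2 eq. per GPU-hour: {kg_per_gpu_hour:.3f}")
print(f"kg CO2 eq. per km: {kg_per_km:.3f}")
```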
Training Data
This model was trained on iNat21, which is a compilation of images matched to Linnaean taxonomic ranks from kingdom through species. Each image is also matched with the common (vernacular) name of its subject where available.
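To make the pairing of images with taxonomic labels concrete, here is a minimal sketch of turning one record's ranks and optional common name into a text string. The function name, the record layout, and the caption wording are all hypothetical; this card does not specify the text format used during training.

```python
# The seven Linnaean ranks, ordered from kingdom through species.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]

def taxonomic_caption(taxon, common_name=None):
    """Join the available ranks in order and append the common name if present.

    The output format is an illustrative assumption, not the training format.
    """
    names = [taxon[rank] for rank in RANKS if taxon.get(rank)]
    caption = " ".join(names)
    if common_name:
        caption += f" with common name {common_name}"
    return caption

# Example record for the monarch butterfly.
monarch = {
    "kingdom": "Animalia",
    "phylum": "Arthropoda",
    "class": "Insecta",
    "order": "Lepidoptera",
    "family": "Nymphalidae",
    "genus": "Danaus",
    "species": "plexippus",
}
with_common = taxonomic_caption(monarch, "monarch butterfly")
without_common = taxonomic_caption(monarch)
```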
Training Hyperparameters
- Training regime: Unlike BioCLIP, this model is trained with a batch size of 16K. For downstream task evaluation, we pick the checkpoint from epoch 65, which has the lowest loss on the validation set (~5% of training samples).
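The checkpoint selection described above amounts to taking the epoch with the minimum validation loss. A minimal sketch follows; the helper name and the loss values are hypothetical and are not taken from the actual training run.

```python
def best_epoch(val_losses):
    """Return the 1-indexed epoch whose validation loss is lowest.

    Ties resolve to the earliest epoch, because min() keeps the first minimum.
    """
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1

# Hypothetical per-epoch validation losses, for illustration only.
val_losses = [2.31, 1.87, 1.52, 1.49, 1.55]
best = best_epoch(val_losses)
print(f"best epoch: {best}")
```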
Summary
BioCLIP outperforms general-domain baselines by 10% on average.
Model Examination
We encourage readers to see Section 4.6 of our paper. In short, the iNat21-only BioCLIP model forms representations that align more closely with the taxonomic hierarchy than those of general-domain baselines like CLIP or OpenCLIP.
Citation
BibTeX:
@software{bioclip2023,
author = {Samuel Stevens and Jiaman Wu and Matthew J. Thompson and Elizabeth G. Campolongo and Chan Hee Song and David Edward Carlyn and Li Dong and Wasila M. Dahdul and Charles Stewart and Tanya Berger-Wolf and Wei-Lun Chao and Yu Su},
doi = {10.57967/hf/1511},
month = nov,
title = {BioCLIP},
version = {v0.1},
year = {2023}
}
Please also cite our paper:
@article{stevens2023bioclip,
title = {BIOCLIP: A Vision Foundation Model for the Tree of Life},
author = {Samuel Stevens and Jiaman Wu and Matthew J Thompson and Elizabeth G Campolongo and Chan Hee Song and David Edward Carlyn and Li Dong and Wasila M Dahdul and Charles Stewart and Tanya Berger-Wolf and Wei-Lun Chao and Yu Su},
year = {2023},
eprint = {2311.18803},
archivePrefix = {arXiv},
primaryClass = {cs.CV}
}
Please also consider citing OpenCLIP and iNat21:
@software{ilharco_gabriel_2021_5143773,
author={Ilharco, Gabriel and Wortsman, Mitchell and Wightman, Ross and Gordon, Cade and Carlini, Nicholas and Taori, Rohan and Dave, Achal and Shankar, Vaishaal and Namkoong, Hongseok and Miller, John and Hajishirzi, Hannaneh and Farhadi, Ali and Schmidt, Ludwig},
title={OpenCLIP},
year={2021},
doi={10.5281/zenodo.5143773},
}
@misc{inat2021,
author={Van Horn, Grant and Mac Aodha, Oisin},
title={iNat Challenge 2021 - FGVC8},
publisher={Kaggle},
year={2021},
url={https://kaggle.com/competitions/inaturalist-2021}
}
Acknowledgements
The authors would like to thank Josef Uyeda, Jim Balhoff, Dan Rubenstein, Hank Bart, Hilmar Lapp, Sara Beery, and colleagues from the Imageomics Institute and the OSU NLP group for their valuable feedback. We also thank the BIOSCAN-1M team and the iNaturalist team for making their data available and easy to use, and Jennifer Hammack at EOL for her invaluable help in accessing EOL’s images.
The Imageomics Institute is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under Award #2118240 (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Model Card Authors
Elizabeth G. Campolongo, Samuel Stevens, and Jiaman Wu
Model Card Contact