---
license:
- mit
language:
- en
library_name: open_clip
tags:
- zero-shot-image-classification
- clip
- biology
- CV
- images
- animals
- species
- taxonomy
- rare species
- endangered species
- evolutionary biology
- multimodal
- knowledge-guided
datasets:
- iNat21
---

# Model Card for BioCLIP

<!--
This model card was generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1) and further altered to suit the needs of the Imageomics Institute. -->

BioCLIP is a foundation model for the tree of life, built using the CLIP architecture as a vision model for general organismal biology.
This model was trained on [iNat21](https://github.com/visipedia/inat_comp/tree/master/2021) only, unlike [BioCLIP](https://huggingface.co/imageomics/bioclip), which was trained on [TreeOfLife-10M](https://huggingface.co/datasets/imageomics/TreeOfLife-10M). More information can be found in the [BioCLIP model card](https://huggingface.co/imageomics/bioclip).

## How to Get Started with the Model

BioCLIP can be used with the `open_clip` library:

```py
import open_clip

model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:imageomics/bioclip-vit-b-16-inat-only')
tokenizer = open_clip.get_tokenizer('hf-hub:imageomics/bioclip-vit-b-16-inat-only')
```
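
The model and transforms returned above can then be used for zero-shot classification following the standard `open_clip` pattern: encode the image and the candidate label texts, normalize, and take a softmax over their similarities. The following is a minimal sketch that continues the snippet above; it assumes `torch` and `Pillow` are available, and the image path and candidate label strings are placeholders (the model was trained on taxonomic labels, so full Linnaean name strings are a natural choice of candidate text):

```py
import torch
from PIL import Image

# Placeholder image and candidate labels; replace with your own.
image = preprocess_val(Image.open("example.jpg")).unsqueeze(0)
candidates = [
    "Animalia Arthropoda Insecta Lepidoptera Nymphalidae Danaus plexippus",
    "Animalia Chordata Aves Passeriformes Corvidae Corvus corax",
]
text = tokenizer(candidates)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # L2-normalize, then softmax over scaled cosine similarities.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(candidates, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```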

## Training Details

### Compute Infrastructure

Training was performed on 4 NVIDIA A100-80GB GPUs distributed over 1 node on [OSC's](https://www.osc.edu/) Ascend HPC Cluster with a global batch size of 16,384 for 2 days.
51
+
52
+ ### Training Data
53
+
54
+ This model was trained on [iNat21](https://github.com/visipedia/inat_comp/tree/master/2021), which is a compilation of images matched to [Linnaean taxonomic rank](https://www.britannica.com/science/taxonomy/The-objectives-of-biological-classification) from kingdom through species. They are also matched with common (vernacular) name of the subject of the image where available.
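
The exact caption templates used during training are described in the BioCLIP paper; purely as an illustration of how the seven Linnaean ranks and an optional common name can be combined into a text label, a hypothetical helper might look like this (the function and template below are illustrative, not the released training code):

```py
# Hypothetical helper; field names and the output template are illustrative only.
def taxonomic_label(kingdom, phylum, taxon_class, order, family, genus, species, common_name=None):
    ranks = " ".join([kingdom, phylum, taxon_class, order, family, genus, species])
    return f"{ranks} ({common_name})" if common_name else ranks

print(taxonomic_label(
    "Animalia", "Chordata", "Aves", "Passeriformes", "Corvidae",
    "Corvus", "corax", common_name="common raven",
))
# Animalia Chordata Aves Passeriformes Corvidae Corvus corax (common raven)
```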

### Training Hyperparameters

- **Training regime:**
  Unlike [BioCLIP](https://huggingface.co/imageomics/bioclip), this model was trained with a batch size of 16K. We select epoch 65, which has the lowest loss on the validation set (~5% of the training samples), for downstream task evaluation; a rough sketch of this selection criterion is shown below.
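
The sketch below shows what selecting a checkpoint by validation loss could look like using `open_clip` primitives. It is an assumption-laden illustration, not this project's training or validation code: `val_batches` (preprocessed image tensors paired with lists of label strings) and the per-epoch `checkpoints` mapping are hypothetical.

```py
import torch
import torch.nn.functional as F

def clip_val_loss(model, tokenizer, val_batches, device="cpu"):
    """Average symmetric InfoNCE (CLIP) loss over hypothetical validation batches."""
    model.eval()
    losses = []
    with torch.no_grad():
        for images, texts in val_batches:  # images: tensor batch, texts: list of label strings
            img = F.normalize(model.encode_image(images.to(device)), dim=-1)
            txt = F.normalize(model.encode_text(tokenizer(texts).to(device)), dim=-1)
            logits = model.logit_scale.exp() * img @ txt.T
            targets = torch.arange(images.shape[0], device=device)
            loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
            losses.append(loss.item())
    return sum(losses) / len(losses)

# Hypothetical usage: `checkpoints` maps epoch -> loaded model; keep the epoch with the lowest loss.
# best_epoch = min(checkpoints, key=lambda e: clip_val_loss(checkpoints[e], tokenizer, val_batches))
```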

### Summary

This iNat21-only BioCLIP model outperforms general-domain baselines by 10% on average.

### Model Examination

We encourage readers to see Section 4.6 of [our paper](https://doi.org/10.48550/arXiv.2311.18803).
In short, this iNat21-only BioCLIP model forms representations that align more closely with the taxonomic hierarchy than those of general-domain baselines such as CLIP and OpenCLIP.
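
As a purely illustrative probe (not the evaluation protocol from Section 4.6), one way to get a feel for this is to compare image-embedding similarities for taxonomically close versus distant organisms, reusing `model` and `preprocess_val` from the snippet above; the file names here are placeholders:

```py
import torch
from PIL import Image

def embed_images(paths):
    # Stack preprocessed images into a batch and return L2-normalized embeddings.
    batch = torch.stack([preprocess_val(Image.open(p)) for p in paths])
    with torch.no_grad():
        feats = model.encode_image(batch)
    return feats / feats.norm(dim=-1, keepdim=True)

ravens = embed_images(["raven_1.jpg", "raven_2.jpg"])   # same species
monarch = embed_images(["monarch_butterfly.jpg"])       # different phylum

print("same-species similarity:", (ravens[0] @ ravens[1]).item())
print("cross-phylum similarity:", (ravens[0] @ monarch[0]).item())
```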

## Citation

**BibTeX:**

```
@software{bioclip2023,
  author = {Samuel Stevens and Jiaman Wu and Matthew J. Thompson and Elizabeth G. Campolongo and Chan Hee Song and David Edward Carlyn and Li Dong and Wasila M. Dahdul and Charles Stewart and Tanya Berger-Wolf and Wei-Lun Chao and Yu Su},
  doi = {10.57967/hf/1511},
  month = nov,
  title = {BioCLIP},
  version = {v0.1},
  year = {2023}
}
```

Please also cite our paper:

```
@article{stevens2023bioclip,
  title = {BioCLIP: A Vision Foundation Model for the Tree of Life},
  author = {Samuel Stevens and Jiaman Wu and Matthew J Thompson and Elizabeth G Campolongo and Chan Hee Song and David Edward Carlyn and Li Dong and Wasila M Dahdul and Charles Stewart and Tanya Berger-Wolf and Wei-Lun Chao and Yu Su},
  year = {2023},
  eprint = {2311.18803},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}
```

Please also consider citing OpenCLIP and iNat21:

```
@software{ilharco_gabriel_2021_5143773,
  author = {Ilharco, Gabriel and Wortsman, Mitchell and Wightman, Ross and Gordon, Cade and Carlini, Nicholas and Taori, Rohan and Dave, Achal and Shankar, Vaishaal and Namkoong, Hongseok and Miller, John and Hajishirzi, Hannaneh and Farhadi, Ali and Schmidt, Ludwig},
  title = {OpenCLIP},
  year = {2021},
  doi = {10.5281/zenodo.5143773}
}
```

```
@misc{inat2021,
  author = {Van Horn, Grant and Mac Aodha, Oisin},
  title = {iNat Challenge 2021 - FGVC8},
  publisher = {Kaggle},
  year = {2021},
  url = {https://kaggle.com/competitions/inaturalist-2021}
}
```

## Acknowledgements

The authors would like to thank Josef Uyeda, Jim Balhoff, Dan Rubenstein, Hank Bart, Hilmar Lapp, Sara Beery, and colleagues from the Imageomics Institute and the OSU NLP group for their valuable feedback. We also thank the BIOSCAN-1M team and the iNaturalist team for making their data available and easy to use, and Jennifer Hammack at EOL for her invaluable help in accessing EOL’s images.

The [Imageomics Institute](https://imageomics.org) is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under [Award #2118240](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2118240) (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

## Model Card Authors

Elizabeth G. Campolongo, Samuel Stevens, and Jiaman Wu

## Model Card Contact

[stevens.994@osu.edu](mailto:stevens.994@osu.edu)