---
license: apache-2.0
language:
- en
tags:
- biology
- CV
- images
- animals
- image classification
- fine-grained classification
- butterflies
- birds
- interpretable
- transformers
- cross-attention
metrics:
---

# Model Card for INTR: A Simple Interpretable Transformer for Fine-grained Image Classification and Analysis

INTR checkpoint on the CUB dataset with a DETR-R50 backbone.

## Model Details

### Model Description

- **Developed by:** Dipanjyoti Paul, Arpita Chowdhury, Xinqi Xiong, Feng-Ju Chang, David Carlyn, Samuel Stevens, Kaiya Provost, Anuj Karpatne, Bryan Carstens, Daniel Rubenstein, Charles Stewart, Tanya Berger-Wolf, Yu Su, and Wei-Lun Chao
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **License:** Apache 2.0
- **Fine-tuned from model:** [DETR-R50](https://github.com/facebookresearch/detr)

### Model Sources

- **Repository:** [Imageomics/INTR](https://github.com/Imageomics/INTR)
- **Paper:** [A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis](https://doi.org/10.48550/arXiv.2311.04157)
- **Demo:** [Inference-time single-image prediction and visualization notebook](https://github.com/Imageomics/INTR/blob/main/demo.ipynb). Note that the demo is focused on the CUB dataset.

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

### Training Data

[More Information Needed]

| Model | Dataset |
|----------|----------|
| [CUB checkpoint download](https://huggingface.co/imageomics/INTR/resolve/main/intr_checkpoint_cub_detr_r50.pth) | [CUB](https://www.vision.caltech.edu/datasets/cub_200_2011/) [More Information Needed] |
| [Bird checkpoint download](https://huggingface.co/imageomics/INTR/resolve/main/intr_checkpoint_bird_detr_r50.pth) | [Birds 525](https://www.kaggle.com/datasets/gpiosenka/100-bird-species) [More Information Needed] |
| [Butterfly checkpoint download](https://huggingface.co/imageomics/INTR/resolve/main/intr_checkpoint_butterfly_detr_r50.pth) | [Cambridge Butterfly](https://huggingface.co/datasets/imageomics/Cambridge_butterfly), images in the `train` folder |

### Training Procedure

[More Information Needed]

Organize the data in the following format:

```
datasets
├── dataset_name
│   ├── train
│   │   ├── class1
│   │   │   ├── img1.jpeg
│   │   │   ├── img2.jpeg
│   │   │   └── ...
│   │   ├── class2
│   │   │   ├── img3.jpeg
│   │   │   └── ...
│   │   └── ...
│   └── val
│       ├── class1
│       │   ├── img4.jpeg
│       │   ├── img5.jpeg
│       │   └── ...
│       ├── class2
│       │   ├── img6.jpeg
│       │   └── ...
│       └── ...
```
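This is the standard one-class-per-subfolder layout, so it can be sanity-checked with `torchvision.datasets.ImageFolder`. The snippet below is a minimal sketch for verifying the directory structure only; the resize and batch settings are illustrative placeholders, not the preprocessing used to train INTR.

```python
# Minimal sketch: sanity-check the datasets/<dataset_name>/{train,val} layout
# with torchvision's ImageFolder. The transform values are placeholders and
# do not necessarily match INTR's training preprocessing.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # placeholder size
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder("datasets/dataset_name/train", transform=transform)
val_set = datasets.ImageFolder("datasets/dataset_name/val", transform=transform)

print(f"{len(train_set.classes)} classes, "
      f"{len(train_set)} train / {len(val_set)} val images")

# Pull one batch to confirm the images load correctly.
loader = DataLoader(train_set, batch_size=16, shuffle=True, num_workers=4)
images, labels = next(iter(loader))
print(images.shape, labels[:8])
```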
#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

To evaluate the performance of INTR on the _CUB_ dataset in a multi-GPU setting (e.g., 4 GPUs), run the command below. Download links for the INTR checkpoints are provided in the Training Data and Testing Data tables.

```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port 12345 --use_env main.py --eval --resume <path/to/intr_checkpoint_cub_detr_r50.pth> --dataset_path <path/to/datasets> --dataset_name <dataset_name>
```

Replace the angle-bracket placeholders with the checkpoint path, the dataset root directory (organized as shown above), and the dataset name. Similarly, replace `cub` in the name of the checkpoint with `bird` or `butterfly` to evaluate with the Birds 525 or Cambridge Butterfly checkpoint, respectively.
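The checkpoints can also be fetched programmatically before running evaluation. The sketch below uses the `huggingface_hub` client; the repository id and filenames are taken from the download links in the tables on this card.

```python
# Sketch: download an INTR checkpoint from this repository with huggingface_hub.
# Filenames correspond to the checkpoint download links in the tables above.
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(
    repo_id="imageomics/INTR",
    # or "intr_checkpoint_bird_detr_r50.pth" / "intr_checkpoint_butterfly_detr_r50.pth"
    filename="intr_checkpoint_cub_detr_r50.pth",
)
print(checkpoint_path)  # pass this local path to main.py via --resume
```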
### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

| Model | Dataset |
|----------|----------|
| [CUB checkpoint download](https://huggingface.co/imageomics/INTR/resolve/main/intr_checkpoint_cub_detr_r50.pth) | [CUB](https://www.vision.caltech.edu/datasets/cub_200_2011/) [More Information Needed] |
| [Birds checkpoint download](https://huggingface.co/imageomics/INTR/resolve/main/intr_checkpoint_bird_detr_r50.pth) | [Birds 525](https://www.kaggle.com/datasets/gpiosenka/100-bird-species) [More Information Needed] |
| [Butterfly checkpoint download](https://huggingface.co/imageomics/INTR/resolve/main/intr_checkpoint_butterfly_detr_r50.pth) | [Cambridge Butterfly](https://huggingface.co/datasets/imageomics/Cambridge_butterfly), images in the `val` folder |

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

| Dataset | acc@1 | acc@5 |
|----------|----------|----------|
| [CUB](https://www.vision.caltech.edu/datasets/cub_200_2011/) | 71.8 | 89.3 |
| [Birds 525](https://www.kaggle.com/datasets/gpiosenka/100-bird-species) | 97.4 | 99.2 |
| [Butterfly](https://huggingface.co/datasets/imageomics/Cambridge_butterfly) | 95.0 | 98.3 |

#### Summary

[More Information Needed]

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://doi.org/10.48550/arXiv.1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation

**BibTeX:**

[![Paper](https://img.shields.io/badge/Paper-10.48550%2FarXiv.2311.04157-blue)](https://doi.org/10.48550/arXiv.2311.04157)

If you find our work helpful for your research, please consider citing our paper:

```
@article{paul2023simple,
  title={A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis},
  author={Paul, Dipanjyoti and Chowdhury, Arpita and Xiong, Xinqi and Chang, Feng-Ju and Carlyn, David and Stevens, Samuel and Provost, Kaiya and Karpatne, Anuj and Carstens, Bryan and Rubenstein, Daniel and Stewart, Charles and Berger-Wolf, Tanya and Su, Yu and Chao, Wei-Lun},
  journal={arXiv preprint arXiv:2311.04157},
  year={2023}
}
```

Model citation:

```
@software{Paul_A_Simple_Interpretable_2023,
  author = {Paul, Dipanjyoti and Chowdhury, Arpita and Xiong, Xinqi and Chang, Feng-Ju and Carlyn, David and Stevens, Samuel and Provost, Kaiya and Karpatne, Anuj and Carstens, Bryan and Rubenstein, Daniel and Stewart, Charles and Berger-Wolf, Tanya and Su, Yu and Chao, Wei-Lun},
  license = {Apache-2.0},
  title = {{A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis}},
  doi = {},
  url = {https://huggingface.co/imageomics/INTR},
  version = {1.0.0},
  month = sep,
  year = {2023}
}
```

**APA:**

Paper:

Paul, D., Chowdhury, A., Xiong, X., Chang, F., Carlyn, D., Stevens, S., Provost, K., Karpatne, A., Carstens, B., Rubenstein, D., Stewart, C., Berger-Wolf, T., Su, Y., & Chao, W. (2023). A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis. arXiv. https://doi.org/10.48550/arXiv.2311.04157

Model citation:

Paul, D., Chowdhury, A., Xiong, X., Chang, F., Carlyn, D., Stevens, S., Provost, K., Karpatne, A., Carstens, B., Rubenstein, D., Stewart, C., Berger-Wolf, T., Su, Y., & Chao, W. (2023). A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis (Version 1.0.0). https://huggingface.co/imageomics/INTR

## Acknowledgements

Our model is inspired by the DEtection TRansformer ([DETR](https://github.com/facebookresearch/detr)) method; we thank the authors of DETR for their excellent work.

The [Imageomics Institute](https://imageomics.org) is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under [Award #2118240](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2118240) (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]