license: apache-2.0
language:
- en
tags:
- biology
- CV
- images
- animals
- image classification
- fine-grained classification
- butterflies
- birds
- interpretable
- transformers
- cross-attention
metrics: null
Model Card for INTR: A Simple Interpretable Transformer for Fine-grained Image Classification and Analysis
INTR checkpoint trained on the CUB dataset with a DETR-R50 backbone.
Model Details
Model Description
- Developed by: Dipanjyoti Paul, Arpita Chowdhury, Xinqi Xiong, Feng-Ju Chang, David Carlyn, Samuel Stevens, Kaiya Provost, Anuj Karpatne, Bryan Carstens, Daniel Rubenstein, Charles Stewart, Tanya Berger-Wolf, Yu Su, and Wei-Lun Chao
- Shared by [optional]: [More Information Needed]
- Model type: Interpretable transformer for fine-grained image classification
- License: Apache 2.0
- Fine-tuned from model: DETR-R50
Model Sources
- Repository: Imageomics/INTR
- Paper: A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis
- Demo: Inference-time single-image prediction and visualization notebook. Note that the notebook focuses on the CUB dataset.
Uses
Direct Use
[More Information Needed]
Downstream Use [optional]
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
Training Details
Training Data
[More Information Needed]
| Model | Dataset |
|---|---|
| CUB checkpoint download | CUB [More Information Needed] |
| Bird checkpoint download | Birds 525 [More Information Needed] |
| Butterfly checkpoint download | Cambridge Butterfly, images in the train folder |
Training Procedure
[More Information Needed]
Organize the data in the following format.
datasets
├── dataset_name
│   ├── train
│   │   ├── class1
│   │   │   ├── img1.jpeg
│   │   │   ├── img2.jpeg
│   │   │   └── ...
│   │   ├── class2
│   │   │   ├── img3.jpeg
│   │   │   └── ...
│   │   └── ...
│   └── val
│       ├── class1
│       │   ├── img4.jpeg
│       │   ├── img5.jpeg
│       │   └── ...
│       ├── class2
│       │   ├── img6.jpeg
│       │   └── ...
│       └── ...
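Before training or evaluation, it can help to confirm that a dataset folder matches the layout above. The following is a minimal sketch, not part of the INTR repository: the function names `summarize_split` and `check_layout` and the `IMAGE_EXTENSIONS` set are hypothetical, and the list of accepted extensions is an assumption.

```python
from pathlib import Path

# Assumption: which file extensions count as images. Adjust as needed.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def summarize_split(dataset_root, split):
    """Return {class_name: image_count} for one split ("train" or "val")."""
    split_dir = Path(dataset_root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split folder: {split_dir}")
    counts = {}
    # Each immediate subfolder of the split is treated as one class.
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def check_layout(dataset_path, dataset_name):
    """Summarize train/val and list problems found in the folder layout."""
    root = Path(dataset_path) / dataset_name
    train = summarize_split(root, "train")
    val = summarize_split(root, "val")
    problems = []
    # Classes present in only one of the two splits.
    for name in sorted(set(train) ^ set(val)):
        where = "train" if name in train else "val"
        problems.append(f"class '{name}' appears only in {where}")
    # Class folders that contain no recognized image files.
    for split, counts in (("train", train), ("val", val)):
        for name, n in counts.items():
            if n == 0:
                problems.append(f"class '{name}' has no images in {split}")
    return train, val, problems
```

Calling `check_layout("<path/to/datasets>", "<dataset_name>")` returns the per-class image counts for both splits plus a list of human-readable problems; an empty list means the folder structure is consistent with the format shown above.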
Preprocessing [optional]
[More Information Needed]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
To evaluate INTR on the CUB dataset in a multi-GPU setting (e.g., 4 GPUs), run the command below. INTR checkpoints are available at Fine-tune model and results.
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port 12345 --use_env main.py --eval --resume <path/to/intr_checkpoint_cub_detr_r50.pth> --dataset_path <path/to/datasets> --dataset_name <dataset_name>
Similarly, replace "cub" in the checkpoint name with "bird" or "butterfly" to evaluate with the Birds 525 or Cambridge Butterfly checkpoint, respectively.
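The checkpoint substitution described above can be scripted. The sketch below only assembles the argument list for the evaluation command shown earlier and does not launch anything. The helper names `build_eval_command` and `cuda_visible_devices` are hypothetical, and the file names for the bird and butterfly checkpoints are an assumption derived from the "replace cub" rule, so verify them against the files you actually downloaded.

```python
import shlex
from pathlib import PurePosixPath

# Checkpoint keys named in this card and the dataset each one corresponds to.
CHECKPOINT_KEYS = {
    "cub": "CUB",
    "bird": "Birds 525",
    "butterfly": "Cambridge Butterfly",
}


def cuda_visible_devices(num_gpus):
    """Return a device list such as "0,1,2,3" for CUDA_VISIBLE_DEVICES."""
    return ",".join(str(i) for i in range(num_gpus))


def build_eval_command(checkpoint_dir, dataset_path, dataset_name,
                       key="cub", num_gpus=4, master_port=12345):
    """Assemble the evaluation command as an argument list."""
    if key not in CHECKPOINT_KEYS:
        raise ValueError(f"unknown checkpoint key: {key!r}")
    # Assumption: file names follow the CUB pattern with the key swapped in.
    checkpoint = PurePosixPath(checkpoint_dir) / f"intr_checkpoint_{key}_detr_r50.pth"
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "--master_port", str(master_port),
        "--use_env", "main.py", "--eval",
        "--resume", str(checkpoint),
        "--dataset_path", dataset_path,
        "--dataset_name", dataset_name,
    ]


if __name__ == "__main__":
    cmd = build_eval_command("checkpoints", "datasets", "butterfly", key="butterfly")
    # Print a copy-pasteable shell line, including the GPU selection prefix.
    print(f"CUDA_VISIBLE_DEVICES={cuda_visible_devices(4)} {shlex.join(cmd)}")
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues; `shlex.join` is used only to print the command for inspection.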
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
| Model | Dataset |
|---|---|
| CUB checkpoint download | CUB [More Information Needed] |
| Birds checkpoint download | Birds 525 [More Information Needed] |
| Butterfly checkpoint download | Cambridge Butterfly, images in the val folder |
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
Summary
[More Information Needed]
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
Technical Specifications [optional]
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
[More Information Needed]
Citation
BibTeX:
If you find our work helpful for your research, please consider citing our paper as well.
@article{paul2023simple,
title={A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis},
author={Paul, Dipanjyoti and Chowdhury, Arpita and Xiong, Xinqi and Chang, Feng-Ju and Carlyn, David and Stevens, Samuel and Provost, Kaiya and Karpatne, Anuj and Carstens, Bryan and Rubenstein, Daniel and Stewart, Charles and Berger-Wolf, Tanya and Su, Yu and Chao, Wei-Lun},
journal={arXiv preprint arXiv:2311.04157},
year={2023}
}
Model Citation:
@software{Paul_A_Simple_Interpretable_2023,
author = {Paul, Dipanjyoti and Chowdhury, Arpita and Xiong, Xinqi and Chang, Feng-Ju and Carlyn, David and Stevens, Samuel and Provost, Kaiya and Karpatne, Anuj and Carstens, Bryan and Rubenstein, Daniel and Stewart, Charles and Berger-Wolf, Tanya and Su, Yu and Chao, Wei-Lun},
license = {Apache-2.0},
title = {{A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis}},
doi = {<doi once generated>},
url = {https://huggingface.co/imageomics/INTR},
version = {1.0.0},
month = sep,
year = {2023}
}
APA:
Paper:
Paul, D., Chowdhury, A., Xiong, X., Chang, F., Carlyn, D., Stevens, S., Provost, K., Karpatne, A., Carstens, B., Rubenstein, D., Stewart, C., Berger-Wolf, T., Su, Y., & Chao, W. (2023). A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis. arXiv. https://doi.org/10.48550/arXiv.2311.04157.
Model Citation:
Paul, D., Chowdhury, A., Xiong, X., Chang, F., Carlyn, D., Stevens, S., Provost, K., Karpatne, A., Carstens, B., Rubenstein, D., Stewart, C., Berger-Wolf, T., Su, Y., & Chao, W. (2023). A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis (Version 1.0.0).
Acknowledgements
Our model is inspired by the DEtection TRansformer (DETR) method.
We thank the authors of DETR for doing such great work.
The Imageomics Institute is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under Award #2118240 (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Glossary [optional]
[More Information Needed]
More Information [optional]
[More Information Needed]
Model Card Authors [optional]
[More Information Needed]
Model Card Contact
[More Information Needed]