sayakpaul HF staff
library_name: keras

This model is a TensorFlow port of ViT B-16 [1], trained with the recipes from [2]. It was first pre-trained on ImageNet-21k and then fine-tuned on ImageNet-1k. Refer to this notebook for details on how the port was done.
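To make the "B-16" naming concrete, here is a minimal sketch of the patch arithmetic described in [1]: an image is cut into 16x16 patches, and each patch becomes one token. The 224x224 input resolution and the extra class token are assumptions taken from the common ViT setup, not from this model card, and the helper names are hypothetical.

```python
def vit_sequence_length(image_size: int = 224,
                        patch_size: int = 16,
                        class_token: bool = True) -> int:
    """Number of tokens the Transformer encoder sees for a square image.

    image_size=224 is an assumed default; this card does not state
    the resolution used by the ported model.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # An optional learnable class token is prepended to the patch tokens.
    return num_patches + (1 if class_token else 0)


def patch_vector_dim(patch_size: int = 16, channels: int = 3) -> int:
    """Length of one flattened patch before the linear projection."""
    return patch_size * patch_size * channels


if __name__ == "__main__":
    print(vit_sequence_length())  # tokens for an assumed 224x224 input
    print(patch_vector_dim())     # flattened size of one 16x16 RGB patch
```

Under these assumptions, a 224x224 image yields 14 x 14 = 196 patches, so the encoder processes 197 tokens once the class token is included. Check the model's actual input signature before relying on these numbers.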

References

[1] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: https://arxiv.org/abs/2010.11929

[2] How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers: https://arxiv.org/abs/2106.10270