---
library_name: keras
---

This model is a TensorFlow port of DINO ViT B-16 [1]. The backbone was pre-trained with the DINO pretext task. Its head layer was then trained on top of the frozen backbone. ImageNet-1k was used for training. Refer to [this notebook](https://github.com/sayakpaul/probing-vits/blob/main/notebooks/load-dino-weights-vitb16.ipynb) to see how the porting was done.

## References

[1] Emerging Properties in Self-Supervised Vision Transformers: https://arxiv.org/abs/2104.14294