metadata

tags:
  - dino
  - vision
datasets:
  - imagenet-1k

DINO ResNet-50

ResNet-50 pretrained with DINO. DINO was introduced in Emerging Properties in Self-Supervised Vision Transformers, while ResNet was introduced in Deep Residual Learning for Image Recognition. The official implementation of a DINO Resnet-50 can be found here.

Weights converted from the official DINO ResNet using this script.

For up-to-date model card information, please see the original repo.

How to use

Warning: The feature extractor in this repo is a copy of the one from microsoft/resnet-50. We never verified if this image prerprocessing is the one used with DINO ResNet-50.

from transformers import AutoFeatureExtractor, ResNetModel
from PIL import Image
import requests

url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = AutoFeatureExtractor.from_pretrained('Ramos-Ramos/dino-resnet-50')
model = ResNetModel.from_pretrained('Ramos-Ramos/dino-resnet-50')
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
last_hidden_states = outputs.last_hidden_state

BibTeX entry and citation info

@article{DBLP:journals/corr/abs-2104-14294,
  author    = {Mathilde Caron and
               Hugo Touvron and
               Ishan Misra and
               Herv{\'{e}} J{\'{e}}gou and
               Julien Mairal and
               Piotr Bojanowski and
               Armand Joulin},
  title     = {Emerging Properties in Self-Supervised Vision Transformers},
  journal   = {CoRR},
  volume    = {abs/2104.14294},
  year      = {2021},
  url       = {https://arxiv.org/abs/2104.14294},
  archivePrefix = {arXiv},
  eprint    = {2104.14294},
  timestamp = {Tue, 04 May 2021 15:12:43 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2104-14294.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@inproceedings{he2016deep,
  title={Deep residual learning for image recognition},
  author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={770--778},
  year={2016}
}