Commit cc1150a by patrickvonplaten
1 parent: 50553ba

Update README.md

  1. README.md +3 -3
README.md CHANGED
@@ -8,7 +8,7 @@ datasets:
  - imagenet-1k
  ---
 
- # Data2Vec-Vision (base-sized model, fine-tuned on ImageNet-1k)
+ # Data2Vec-Vision (large-sized model, fine-tuned on ImageNet-1k)
 
  BEiT model pre-trained in a self-supervised fashion and fine-tuned on ImageNet-1k (1,2 million images, 1000 classes) at resolution 224x224. It was introduced in the paper [data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli and first released in [this repository](https://github.com/facebookresearch/data2vec_vision/tree/main/beit).
 
@@ -51,8 +51,8 @@ from PIL import Image
  import requests
  url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
  image = Image.open(requests.get(url, stream=True).raw)
- feature_extractor = BeitFeatureExtractor.from_pretrained('facebook/data2vec-vision-base-ft1k')
- model = Data2VecVisionForImageClassification.from_pretrained('facebook/data2vec-vision-base-ft1k')
+ feature_extractor = BeitFeatureExtractor.from_pretrained('facebook/data2vec-vision-large-ft1k')
+ model = Data2VecVisionForImageClassification.from_pretrained('facebook/data2vec-vision-large-ft1k')
  inputs = feature_extractor(images=image, return_tensors="pt")
  outputs = model(**inputs)
  logits = outputs.logits
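
The snippet in the second hunk stops at `logits`. Turning those scores into a class name usually means taking the index of the largest logit and looking it up in a label map. With the real model, that would typically be `logits.argmax(-1).item()` followed by a lookup in `model.config.id2label`. The sketch below is not part of this commit; it shows the same step in plain Python so it runs without downloading the checkpoint. The helper name `predict_label` and the toy values are hypothetical.

```python
# Plain-Python stand-in for the post-processing step after `outputs.logits`.
# Assumption: `logits_row` is one row of scores (one float per class) and
# `id2label` maps class index -> class name, like `model.config.id2label`.

def predict_label(logits_row, id2label):
    """Return (index, label) for the highest-scoring class."""
    # Index of the largest score; ties resolve to the first maximum.
    best_idx = max(range(len(logits_row)), key=logits_row.__getitem__)
    return best_idx, id2label[best_idx]


# Toy values for illustration only; these are not real model outputs.
toy_logits = [0.1, 2.5, -1.0]
toy_labels = {0: "tabby cat", 1: "Egyptian cat", 2: "remote control"}

idx, label = predict_label(toy_logits, toy_labels)
print("Predicted class:", label)
```

Softmax is not needed to pick the top class, because it preserves the ordering of the logits.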