patrickvonplaten committed
Commit cc1150a (1 parent: 50553ba)
Update README.md
README.md CHANGED
@@ -8,7 +8,7 @@ datasets:
 - imagenet-1k
 ---
 
-# Data2Vec-Vision (
+# Data2Vec-Vision (large-sized model, fine-tuned on ImageNet-1k)
 
 BEiT model pre-trained in a self-supervised fashion and fine-tuned on ImageNet-1k (1,2 million images, 1000 classes) at resolution 224x224. It was introduced in the paper [data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language](https://arxiv.org/abs/2202.03555) by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli and first released in [this repository](https://github.com/facebookresearch/data2vec_vision/tree/main/beit).
 
@@ -51,8 +51,8 @@ from PIL import Image
 import requests
 url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
 image = Image.open(requests.get(url, stream=True).raw)
-feature_extractor = BeitFeatureExtractor.from_pretrained('facebook/data2vec-vision-
-model = Data2VecVisionForImageClassification.from_pretrained('facebook/data2vec-vision-
+feature_extractor = BeitFeatureExtractor.from_pretrained('facebook/data2vec-vision-large-ft1k')
+model = Data2VecVisionForImageClassification.from_pretrained('facebook/data2vec-vision-large-ft1k')
 inputs = feature_extractor(images=image, return_tensors="pt")
 outputs = model(**inputs)
 logits = outputs.logits
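
The snippet in this diff stops at `logits = outputs.logits`. As a rough illustration of what one might do with those scores next, the sketch below turns a list of logits into a predicted label and a probability. It is pure Python so that it runs without downloading the model. The values in `example_logits` and the names in `example_id2label` are hypothetical stand-ins, not output from this checkpoint, and `softmax` and `predict_label` are illustrative helpers rather than part of any library.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for stability."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best_idx = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best_idx], probs[best_idx]


# Hypothetical stand-ins: a real ImageNet-1k model emits 1000 logits per image,
# and the label names would come from the model's own configuration.
example_logits = [0.3, 4.1, -1.2]
example_id2label = {0: "tabby cat", 1: "Egyptian cat", 2: "remote control"}

label, prob = predict_label(example_logits, example_id2label)
print(f"Predicted class: {label} (p={prob:.3f})")
```

With the actual model loaded as in the diff, the equivalent step is typically `logits.argmax(-1).item()` followed by a lookup in `model.config.id2label`.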