BVRA / MegaDescriptor-L-384

picekl committed on Nov 12, 2023

Commit 4221c4c • 1 Parent(s): 2ca7823

Create README.md

Files changed (1)

README.md +55 -0

	@@ -0,0 +1,55 @@

+---
+tags:
+- image-classification
+- ecology
+- animals
+- re-identification
+library_name: wildlife-datasets
+license: cc-by-nc-4.0
+---
+# Model card for MegaDescriptor-L-384
+A Swin-L image feature model, pre-trained with supervision on animal re-identification datasets.
+## Model Details
+- **Model Type:** Animal re-identification / feature backbone
+- **Model Stats:**
+  - Params (M): ??
+  - Image size: 384 x 384
+- **Papers:**
+  - Swin Transformer: Hierarchical Vision Transformer using Shifted Windows --> https://arxiv.org/abs/2103.14030
+- **Original:** ??
+- **Pretrain Dataset:** All available re-identification datasets --> TBD
+## Model Usage
+### Image Embeddings
+```python
+import timm
+import torch
+import torchvision.transforms as T
+from PIL import Image
+from urllib.request import urlopen
+model = timm.create_model("hf-hub:BVRA/MegaDescriptor-L-384", pretrained=True)
+model = model.eval()
+train_transforms = T.Compose([T.Resize((384, 384)),
+                              T.ToTensor(),
+                              T.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
+img = Image.open(urlopen(
+    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
+)).convert('RGB')  # ensure 3 channels so the 3-channel Normalize applies
+with torch.no_grad():  # inference only, no gradients needed
+    output = model(train_transforms(img).unsqueeze(0))  # (1, num_features) shaped tensor
+```
+## Citation
+```bibtex
+TBD
+```
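
The README's usage snippet stops at producing an embedding. For re-identification, a common next step is to compare a query embedding against reference embeddings of known individuals. The sketch below is not part of the commit and makes several assumptions: cosine similarity as the metric (the README does not name one), hypothetical helper names `cosine_similarity` and `best_match`, hypothetical identity labels, and toy 3-dimensional vectors in place of real model outputs. It uses plain Python so that it runs without extra dependencies.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def best_match(query, gallery):
    """Return (label, similarity) for the gallery entry closest to the query.

    `gallery` maps a hypothetical identity label to a reference embedding.
    """
    if not gallery:
        raise ValueError("gallery is empty")
    scored = ((label, cosine_similarity(query, emb)) for label, emb in gallery.items())
    return max(scored, key=lambda item: item[1])


# Toy vectors standing in for real embeddings (assumption, not model output).
gallery = {
    "turtle_A": [1.0, 0.0, 0.0],
    "turtle_B": [0.0, 1.0, 0.0],
}
query = [0.9, 0.1, 0.0]
label, score = best_match(query, gallery)
print(label, round(score, 3))
```

With the real model, `output[0].tolist()` from the README snippet would give a list that can be passed as `query`. For larger galleries, the same comparison is usually done with batched tensor operations rather than Python loops.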