---
pipeline_tag: image-classification
tags:
- vision
widget:
- src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/cat-dog-music.png
  example_title: Cat & Dog
---

# Category Search from External Databases (CaSED)

Disclaimer: This model card is adapted from the official repository, which can be found [here](https://github.com/altndrr/vic). The paper can be found [here](https://arxiv.org/abs/2306.00917).

## Intended uses & limitations

You can use the model for vocabulary-free image classification, i.e., classification with CLIP-like models that does not require a pre-defined list of class names.

## How to use

Here is how to use this model:

```python
import requests
from PIL import Image
from transformers import AutoModel, CLIPProcessor

# download an image from the internet
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# load the model and the processor
model = AutoModel.from_pretrained("altndrr/cased", trust_remote_code=True)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

# get the model outputs
images = processor(images=[image], return_tensors="pt", padding=True)
outputs = model(images, alpha=0.5)
labels, scores = outputs["vocabularies"][0], outputs["scores"][0]

# print the top 5 most likely labels for the image
values, indices = scores.topk(5)
print("\nTop predictions:\n")
for value, index in zip(values, indices):
    print(f"{labels[index]:>16s}: {100 * value.item():.2f}%")
```
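
`torch.topk` raises an error when asked for more elements than the tensor holds, so `scores.topk(5)` in the snippet above would fail if an image's vocabulary had fewer than five candidates. Whether that can happen with this model is an assumption, since this card does not state a minimum vocabulary size. The following is a minimal sketch of a defensive alternative in plain Python; `top_k_labels` and the example values are hypothetical and not part of the model's API.

```python
def top_k_labels(labels, scores, k=5):
    """Return up to k (label, score) pairs, highest score first."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    k = min(k, len(labels))  # never ask for more entries than exist
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical values standing in for outputs["vocabularies"][0] and
# outputs["scores"][0].tolist(); they are not real model outputs.
example_labels = ["cat", "remote control", "sofa"]
example_scores = [0.61, 0.27, 0.12]

for label, score in top_k_labels(example_labels, example_scores, k=5):
    print(f"{label:>16s}: {100 * score:.2f}%")
```

With real outputs, one could pass `outputs["vocabularies"][0]` and `outputs["scores"][0].tolist()` to the helper instead of the example lists.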

The model depends on several libraries that you must install manually before running it:

```bash
pip install torch faiss-cpu flair inflect nltk transformers
```
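
To confirm that the environment is ready before loading the model, a small check along these lines may help. It is only a sketch: the list of import names is inferred from the pip command above (note that `faiss-cpu` is imported as `faiss`), and `find_missing` is a hypothetical helper, not something the model repository provides.

```python
import importlib.util

# Import names for the packages in the pip command above.
# Note that faiss-cpu is imported as "faiss".
REQUIRED_MODULES = ["torch", "faiss", "flair", "inflect", "nltk", "transformers"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

The check only looks up each module without importing it, so it runs quickly even for large packages.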

## Citation

```bibtex
@misc{conti2023vocabularyfree,
      title={Vocabulary-free Image Classification},
      author={Alessandro Conti and Enrico Fini and Massimiliano Mancini and Paolo Rota and Yiming Wang and Elisa Ricci},
      year={2023},
      eprint={2306.00917},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```