Spaces: kingfroglao/animal-similarity

kingfroglao committed on Apr 4

Commit 9b3e244

1 Parent(s): 844a59d

add readme

Files changed (2)

README.md +65 -0
app.py +2 -2

README.md CHANGED

@@ -11,3 +11,68 @@ license: mit
 ---
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# 🐾 Animal Species Similarity Comparison
+This Hugging Face Space lets users upload two animal images and estimates whether they show the same species. It combines object detection (DETR), image classification (ViT), and visual embeddings (ViT and ResNet-50).
+---
+## How It Works
+This app performs the following tasks:
+1. **Object Detection** using [facebook/detr-resnet-50](https://huggingface.co/facebook/detr-resnet-50) to locate animals and draw bounding boxes.
+2. **Image Classification** using [google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224) to predict the species label of each animal.
+3. **Visual Similarity Calculation** using:
+   - **ViT embeddings** for global semantic similarity
+   - **ResNet-50 embeddings** for local feature comparison
+   - **Label match indicator** based on top-1 classification
+The three scores are combined into a single weighted final score, which is then mapped to a decision such as "Possibly same species".
+---
+## Example Use
+Upload two images of animals (cats, zebras, dogs, or other wild animals) and get a prediction like:
+```
+ViT Similarity: 0.742
+ResNet Similarity: 0.815
+Label Match: 1.0
+Final Score: 0.762 → 🟡 Possibly same species
+```
+---
+## References
+- Carion, N., et al. (2020). ["End-to-End Object Detection with Transformers (DETR)"](https://arxiv.org/abs/2005.12872).
+- Dosovitskiy, A., et al. (2021). ["An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT)"](https://arxiv.org/abs/2010.11929).
+- Hugging Face Transformers: https://huggingface.co/docs/transformers
+- timm: PyTorch Image Models by Ross Wightman – https://github.com/huggingface/pytorch-image-models
+---
+## 🤖 Acknowledgment
+This project was built with the assistance of **ChatGPT** (OpenAI) to support code generation, explanation, formatting, and technical writing.
+---
+## Built With
+- Hugging Face Spaces + Transformers
+- PyTorch + TorchVision
+- Gradio UI
+- ViT + DETR + ResNet50
+---
+## Requirements
+```
+transformers>=4.36.0
+torch
+torchvision
+timm
+gradio
+Pillow
+```
+---
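
The README's "How It Works" section describes a weighted fusion of the ViT similarity, the ResNet similarity, and the label match, but this commit does not show how it is computed. The following is a minimal sketch of one way such a fusion could look, not the Space's actual implementation. The function names, the weights, and the decision thresholds are all hypothetical, and plain Python lists stand in for the model embeddings.

```python
import math
from typing import Mapping, Sequence

# Hypothetical weights and thresholds, chosen only for illustration.
# The values used by the Space are not shown in this commit.
WEIGHTS = {"vit": 0.4, "resnet": 0.3, "label": 0.3}
SAME_THRESHOLD = 0.8
POSSIBLE_THRESHOLD = 0.6


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def label_match(label1: str, label2: str) -> float:
    """1.0 if the top-1 labels agree, otherwise 0.0."""
    return 1.0 if label1 == label2 else 0.0


def fuse_scores(
    vit_score: float,
    resnet_score: float,
    label_score: float,
    weights: Mapping[str, float] = WEIGHTS,
) -> float:
    """Weighted average of the three scores, normalized by the weight sum."""
    total = sum(weights.values())
    weighted = (
        weights["vit"] * vit_score
        + weights["resnet"] * resnet_score
        + weights["label"] * label_score
    )
    return weighted / total


def interpret(final_score: float) -> str:
    """Map the fused score to a coarse decision using the assumed thresholds."""
    if final_score >= SAME_THRESHOLD:
        return "Likely same species"
    if final_score >= POSSIBLE_THRESHOLD:
        return "Possibly same species"
    return "Likely different species"
```

Because the weights here are assumed, this sketch will not reproduce the final score shown in the README's example output. Normalizing by the weight sum keeps the fused score in the same range as its inputs even if the weights are later retuned.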

app.py CHANGED

@@ -12,8 +12,8 @@ def process(img1, img2):
     boxed_img2 = draw_detr_boxes(img2.copy())
     final_text = f"""
-    🌍 ViT Similarity: {result['vit_score']:.3f}
-    🔬 ResNet Similarity: {result['resnet_score']:.3f}
+    📊 ViT Similarity: {result['vit_score']:.3f}
+    📊 ResNet Similarity: {result['resnet_score']:.3f}
     📊 Label Match: {result['label_match']:.1f}
     ⭐ Final Score: {result['final_score']:.3f}
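
The hunk above only changes the emoji prefixes in the summary text. As a rough illustration of what that text looks like once built, here is a small standalone sketch. The helper name `format_result` is hypothetical and is not part of `app.py`; the dictionary keys are taken from the diff, and the example values come from the README's sample output.

```python
from typing import Mapping


def format_result(result: Mapping[str, float]) -> str:
    """Build the four-line summary from the score keys used in the diff."""
    lines = [
        f"📊 ViT Similarity: {result['vit_score']:.3f}",
        f"📊 ResNet Similarity: {result['resnet_score']:.3f}",
        f"📊 Label Match: {result['label_match']:.1f}",
        f"⭐ Final Score: {result['final_score']:.3f}",
    ]
    return "\n".join(lines)


# Example values copied from the README's sample output.
example = {
    "vit_score": 0.742,
    "resnet_score": 0.815,
    "label_match": 1.0,
    "final_score": 0.762,
}
summary = format_result(example)
```

Joining a list of lines avoids the leading whitespace that an indented triple-quoted f-string carries into the displayed text.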