library_name: transformers
license: creativeml-openrail-m
base_model:
  - facebook/detr-resnet-50-panoptic
datasets:
  - FriedParrot/a-large-scale-fish-dataset
language:
  - en

Model Card for Fish Segmentation (Fine-Tuned DETR)

This is a fine-tuned DETR model (facebook/detr-resnet-50-panoptic) adapted for fish detection and segmentation. The model performs multi-task prediction including:

  • Classification (fish species recognition)
  • Bounding Box prediction
  • Segmentation masks

It has 42.9M parameters and was trained on A Large Scale Fish Dataset from Kaggle.

A copy of this dataset is available on Hugging Face as FriedParrot/a-large-scale-fish-dataset.

Model Sources

This model is fully compatible with AutoModelForObjectDetection, AutoProcessor, and the Hugging Face Trainer. Unlike the first model (fish-segmentation-model), it does not require custom config classes.
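As a rough illustration of that compatibility, the sketch below loads a checkpoint with the two Auto classes and runs box detection on one image. It is a minimal sketch under assumptions, not the author's own inference script: the repository id and image path are placeholders you must supply, `detect_fish` and `format_detections` are hypothetical helper names, and only boxes and class labels are handled here (mask decoding is not covered).

```python
def format_detections(scores, labels, boxes, id2label):
    """Turn post-processed scores/labels/boxes into readable (name, score, box) rows."""
    rows = []
    for score, label, box in zip(scores, labels, boxes):
        # Fall back to the numeric id if the config has no name for this class.
        name = id2label.get(int(label), str(int(label)))
        rows.append((name, round(float(score), 3), [round(float(v), 1) for v in box]))
    return rows


def detect_fish(image_path, repo_id, threshold=0.5):
    """Run detection on one image. `repo_id` is a placeholder for this model's Hub id."""
    # Heavy imports are kept local so format_detections can be used without them.
    import torch
    from PIL import Image
    from transformers import AutoModelForObjectDetection, AutoProcessor

    processor = AutoProcessor.from_pretrained(repo_id)
    model = AutoModelForObjectDetection.from_pretrained(repo_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL's image.size is (width, height).
    target_sizes = torch.tensor([image.size[::-1]])
    result = processor.post_process_object_detection(
        outputs, threshold=threshold, target_sizes=target_sizes
    )[0]
    return format_detections(
        result["scores"].tolist(),
        result["labels"].tolist(),
        result["boxes"].tolist(),
        model.config.id2label,
    )
```

Calling `detect_fish("fish.jpg", "<this-model-repo-id>")` would return a list of `(species, score, [x_min, y_min, x_max, y_max])` rows in pixel coordinates; the 0.5 threshold is an arbitrary starting point, not a value from this card.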

Training Details

  • Hardware: NVIDIA RTX 4090 (48GB VRAM)
  • CUDA: 12.8
  • Framework: PyTorch + Hugging Face Transformers
  • Batch size: 8 (training)
  • Training strategy: Direct fine-tuning of DETR with minimal modifications
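The settings above could be passed to the Hugging Face Trainer roughly as follows. This is a hedged sketch rather than the actual training configuration: only the batch size of 8 comes from this card, `remove_unused_columns=False` is an assumption commonly needed for DETR-style datasets, and `build_training_kwargs` and `build_training_args` are hypothetical helper names. Learning rate, epoch count, and other hyperparameters are not stated in this card and are left at library defaults.

```python
def build_training_kwargs(output_dir, train_batch_size=8):
    """Keyword arguments intended for transformers.TrainingArguments."""
    return {
        "output_dir": output_dir,
        # Batch size of 8, as listed in the training details.
        "per_device_train_batch_size": train_batch_size,
        # Assumption: keep image/annotation columns that the Trainer would
        # otherwise drop before they reach the collate function.
        "remove_unused_columns": False,
    }


def build_training_args(output_dir):
    """Wrap the kwargs in a TrainingArguments object."""
    # Imported lazily so the kwargs can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**build_training_kwargs(output_dir))
```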

Results & Example Predictions

Because the model is fine-tuned from a pretrained DETR checkpoint, accuracy is high, and classification accuracy can reach about 100%.

The predicted bounding boxes and masks are also very accurate:

[Image: example predictions]