Model Card for DETR Finetuned on CPPE-5

Model Overview

This model is a fine-tuned version of facebook/detr-resnet-50. The card's title names CPPE-5, a dataset of personal protective equipment (PPE) images, as the fine-tuning dataset. Fine-tuning adapts the model to detect PPE items such as face shields, masks, gloves, and goggles.

The model is based on the DEtection TRansformer (DETR) architecture with a ResNet-50 backbone for feature extraction. The fine-tuned version keeps DETR's object detection design but is adapted specifically to PPE items relevant to occupational safety.
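
A minimal inference sketch, assuming the Hugging Face transformers library (with a PyTorch backend) is installed and that the checkpoint is available under the repository id shown on this page, ashaduzzaman/detr_finetuned_cppe5. The helper names, the image path in the usage note, and the 0.5 score threshold are illustrative assumptions, not part of the original card.

```python
from typing import Any, Dict, List

# Repository id as it appears on this model page.
MODEL_ID = "ashaduzzaman/detr_finetuned_cppe5"


def detect_ppe(image_path: str, threshold: float = 0.5) -> List[Dict[str, Any]]:
    """Run the fine-tuned DETR checkpoint on one image.

    Returns a list of dicts with "score", "label", and "box" keys,
    where "box" holds xmin, ymin, xmax, ymax in pixels.
    """
    # Imported lazily so summarize() below works without transformers installed.
    from transformers import pipeline

    detector = pipeline("object-detection", model=MODEL_ID)
    return detector(image_path, threshold=threshold)


def summarize(detections: List[Dict[str, Any]]) -> List[str]:
    """Format detections as short strings, highest score first."""
    ordered = sorted(detections, key=lambda d: d["score"], reverse=True)
    return [
        f'{d["label"]}: {d["score"]:.2f} at '
        f'({d["box"]["xmin"]}, {d["box"]["ymin"]}, '
        f'{d["box"]["xmax"]}, {d["box"]["ymax"]})'
        for d in ordered
    ]
```

A call such as summarize(detect_ppe("worker.jpg")) would print one line per detection; "worker.jpg" is a hypothetical local file.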

Model Performance

The model achieves the following metrics on its evaluation set:

  • Loss: 1.2294
  • mAP (mean Average Precision):
    • Overall: 0.2366
    • At IoU threshold 0.50: 0.4852
    • At IoU threshold 0.75: 0.2032
    • Small objects: 0.1082
    • Medium objects: 0.2086
    • Large objects: 0.3408
  • mAR (mean Average Recall):
    • At 1 detection: 0.2819
    • At 10 detections: 0.4463
    • At 100 detections: 0.4665
    • Small objects: 0.249
    • Medium objects: 0.4004
    • Large objects: 0.5893

Precision and recall vary across categories (face shields, gloves, goggles, masks); per-category figures are not listed here. Small objects such as goggles leave the most room for improvement.
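
The two IoU-specific mAP figures differ because a predicted box only counts as a match when its Intersection over Union (IoU) with a ground-truth box reaches the threshold. The sketch below shows the computation for axis-aligned boxes; the function names and the example boxes are illustrative assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float) -> bool:
    """True when the prediction overlaps the ground truth enough to count."""
    return iou(pred, truth) >= threshold


# A prediction shifted 20 px to the right of a 100x100 ground-truth box:
truth = (0.0, 0.0, 100.0, 100.0)
pred = (20.0, 0.0, 120.0, 100.0)
overlap = iou(pred, truth)  # 8000 / 12000, about 0.667
```

In this example the prediction counts as correct at the 0.50 threshold but not at 0.75. Roughly placed boxes of this kind are one plausible reason the score at 0.75 (0.2032) is less than half the score at 0.50 (0.4852).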

Intended Use and Limitations

Intended Use

  • Detecting personal protective equipment (PPE) in images or video streams.
  • Monitoring workplace safety by checking whether PPE items such as masks, gloves, face shields, and goggles are being worn.
  • Suitable for industries like construction, healthcare, and manufacturing where PPE detection is critical for compliance and safety.
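
As a sketch of the compliance-monitoring use case, the helper below compares detected labels against a required set. It assumes detections in the list-of-dicts form returned by the transformers object-detection pipeline. The label strings are hypothetical: the checkpoint's actual class names should be read from its config (the id2label mapping) before relying on them.

```python
from typing import Any, Dict, Iterable, List, Set


def missing_ppe(
    detections: List[Dict[str, Any]],
    required: Iterable[str],
    min_score: float = 0.5,
) -> Set[str]:
    """Return the required labels that were not detected confidently.

    min_score is an illustrative cut-off, not a value from the card.
    """
    found = {d["label"] for d in detections if d["score"] >= min_score}
    return set(required) - found


# Hypothetical detections for one frame; the label names are assumptions.
frame = [
    {"label": "Mask", "score": 0.91},
    {"label": "Gloves", "score": 0.32},  # below the cut-off, treated as absent
]
gaps = missing_ppe(frame, required=["Mask", "Gloves"])
```

Given the recall figures reported above, a missing label may be a missed detection rather than a missing item, so a flag from this check is best treated as a prompt for human review.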

Limitations

  • The model may not generalize well to non-PPE items or general object detection tasks.
  • Performance on small or occluded objects can be limited, as indicated by lower mAP and mAR scores for small objects.
  • The model was trained on a dataset specific to PPE detection, so its performance on images outside of this domain might be inconsistent.
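
The card does not say how "small", "medium", and "large" are defined. The sketch below assumes the COCO evaluation convention, where objects are bucketed by pixel area with cut-offs at 32² and 96². Using bounding-box area instead of the annotated object area is a further simplification.

```python
def size_bucket(width: float, height: float) -> str:
    """Classify a box by area, following the COCO size convention (assumed)."""
    area = width * height
    if area < 32 ** 2:   # under 1024 px^2
        return "small"
    if area < 96 ** 2:   # under 9216 px^2
        return "medium"
    return "large"
```

Under this assumption, a pair of goggles that occupies 30x20 pixels falls into the "small" bucket, where this model scores lowest (mAP 0.1082).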

Training and Evaluation Data

The card does not describe the training and evaluation data in detail. The title and model name point to the CPPE-5 dataset, which covers personal protective equipment such as face shields, masks, goggles, and gloves.

Training Procedure

Hyperparameters:

  • Learning rate: 5e-05
  • Train batch size: 8
  • Eval batch size: 8
  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
  • Learning rate scheduler: Cosine decay
  • Number of epochs: 30
  • Seed: 42

The model was trained for 30 epochs with the Adam optimizer, a learning rate of 5e-05 with cosine decay, and a batch size of 8 for both training and evaluation.
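
To illustrate what cosine decay does to the learning rate, the sketch below computes the rate as a function of training progress. It assumes no warmup and a decay to zero over the full run, neither of which the card states. The second function shows how the listed hyperparameters could map onto transformers TrainingArguments; the output directory is hypothetical, and the card does not confirm that the Trainer API was used.

```python
import math

BASE_LR = 5e-05  # learning rate from the hyperparameter list
EPOCHS = 30      # number of epochs from the hyperparameter list


def cosine_lr(progress: float, base_lr: float = BASE_LR) -> float:
    """Learning rate at a given fraction of training (0.0 = start, 1.0 = end).

    Assumes plain cosine decay to zero with no warmup.
    """
    progress = min(max(progress, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def build_training_args(output_dir: str = "detr_finetuned_cppe5"):
    """One possible mapping of the listed hyperparameters (an assumption)."""
    # Imported lazily so cosine_lr() works without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,  # hypothetical path
        learning_rate=BASE_LR,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        num_train_epochs=EPOCHS,
        lr_scheduler_type="cosine",
        adam_beta1=0.9,
        adam_beta2=0.999,
        adam_epsilon=1e-08,
        seed=42,
    )
```

With these assumptions the rate starts at 5e-05, passes 2.5e-05 halfway through (end of epoch 15), and reaches zero at the end of epoch 30.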

Evaluation Results

Metrics logged on the validation set at selected epochs during training:

Epoch  Validation Loss  mAP     mAP@50  mAP@75  mAR     Comments
1      2.1073           0.0518  0.1075  0.0423  0.2819  Initial training
5      1.6220           0.1223  0.2258  0.1115  0.4463  Significant improvement
10     1.5033           0.155   0.3265  0.1325  0.5032  Continued improvement
20     1.2649           0.2211  0.4427  0.1952  0.5867  Continued improvement
25     1.2347           0.2333  0.4831  0.1989  0.5966  Best of the listed epochs; training ran to epoch 30
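
A quick check of the logged rows, using only the numbers in the table above, shows which listed epoch scores best. The variable and function names are illustrative.

```python
# (epoch, validation loss, mAP, mAP@50, mAP@75, mAR), copied from the table.
LOGGED = [
    (1, 2.1073, 0.0518, 0.1075, 0.0423, 0.2819),
    (5, 1.6220, 0.1223, 0.2258, 0.1115, 0.4463),
    (10, 1.5033, 0.155, 0.3265, 0.1325, 0.5032),
    (20, 1.2649, 0.2211, 0.4427, 0.1952, 0.5867),
    (25, 1.2347, 0.2333, 0.4831, 0.1989, 0.5966),
]


def best_epoch_by_map(rows):
    """Epoch with the highest overall mAP among the logged rows."""
    return max(rows, key=lambda r: r[2])[0]


def best_epoch_by_loss(rows):
    """Epoch with the lowest validation loss among the logged rows."""
    return min(rows, key=lambda r: r[1])[0]


def improves_monotonically(rows):
    """True if mAP rises and loss falls at every successive logged epoch."""
    pairs = list(zip(rows, rows[1:]))
    return all(b[2] > a[2] and b[1] < a[1] for a, b in pairs)
```

Among the listed rows, epoch 25 has both the highest mAP and the lowest loss, and every logged epoch improves on the previous one. The metrics reported under Model Performance (loss 1.2294, mAP 0.2366) are slightly better again, which is consistent with training continuing to epoch 30.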

Limitations and Ethical Considerations

Limitations:

  • Domain-specific: The model is tuned for PPE detection and may not generalize to other tasks.
  • Bias: If the dataset is skewed or limited, certain PPE items may be under-represented, leading to poorer performance for some categories.
  • Real-time Applications: The model might not meet the latency requirements for real-time detection in high-throughput environments.
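
The card reports no latency figures. A simple way to check whether the model meets a given real-time budget is to time repeated calls, as in the sketch below. The helper name, run counts, and returned keys are illustrative assumptions; the callable passed in would wrap a single inference call on a representative image.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(
    fn: Callable[[], object], runs: int = 20, warmup: int = 3
) -> Dict[str, float]:
    """Time repeated calls to fn and report latency in milliseconds."""
    # Warm-up calls are excluded so one-off setup costs do not skew results.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.mean(samples)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }
```

On a GPU, the timed callable should wait for the device to finish (for example by moving results back to the CPU) before returning; otherwise asynchronous execution makes the numbers look better than they are.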

Ethical Considerations:

  • Privacy: Using this model in surveillance scenarios (e.g., workplaces) may raise concerns about employee privacy, especially if applied without clear consent.
  • Misuse: Relying on the model's output without human review could lead to incorrect enforcement of safety regulations, since detections can be missed or wrong.

Future Work

  • Dataset Improvements: Expanding the dataset to include more diverse PPE items, environments, and object scales could improve model performance, especially for smaller objects.
  • Model Efficiency: Further fine-tuning or model distillation may help make the model more suitable for real-time applications.

Model Details

  • Repository: ashaduzzaman/detr_finetuned_cppe5
  • Weights format: Safetensors
  • Model size: 41.6M parameters
  • Tensor type: F32