oxford-pets-vit-from-scratch

Model type: Vision Transformer (ViT) for image classification, fine-tuned with knowledge distillation.

The model is a Vision Transformer (ViT) student, WinKawaks/vit-tiny-patch16-224, trained by knowledge distillation from a larger ViT teacher, google/vit-base-patch16-224. Training uses the Hugging Face Trainer and ImageDistilTrainer classes, with knowledge distillation transferring what the teacher has learned to the student. The training hyperparameters are a batch size of 48, a learning rate of 3e-4, and 10 training epochs. The distillation temperature is 5, and the lambda parameter (the weight of the distillation loss) is 0.9. The model takes image pixel values as input, produced by the ViTImageProcessor.
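To make the roles of the temperature and lambda concrete, here is a minimal pure-Python sketch of a common distillation objective: hard-label cross-entropy blended with a temperature-softened KL term. It assumes the usual formulation (including the T^2 rescaling of the soft term). The function names are illustrative, and the ImageDistilTrainer used for this model may differ in detail.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=5.0, lambda_param=0.9):
    """Blend hard-label cross-entropy with a softened KL term (assumed form)."""
    # Cross-entropy of the student against the ground-truth label (T = 1).
    hard_loss = -math.log(softmax(student_logits)[label])
    # KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft_loss = sum(t * math.log(t / s)
                    for t, s in zip(p_teacher, p_student) if t > 0)
    # T^2 keeps the soft-term gradients on a comparable scale across temperatures.
    soft_loss *= temperature ** 2
    return (1.0 - lambda_param) * hard_loss + lambda_param * soft_loss
```

With lambda at 0.9, the softened teacher signal dominates the objective and the ground-truth labels contribute the remaining 10%.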

Intended Use

This model is intended for classifying pet images (cats and dogs of various breeds). It can be used for education, for demonstrating knowledge distillation, or as a starting point for similar image classification tasks. The intended users are researchers, students, and machine learning enthusiasts interested in image classification and knowledge distillation. The model is not intended for production environments or critical applications, and it should not be used for tasks involving sensitive or personal information. It may not generalize well to images outside the Oxford Pets dataset.

Factors

The model is trained to classify images of cats and dogs of different breeds. Factors that might influence model performance include image quality, lighting conditions, pose of the animal, and background clutter. The model is evaluated on its accuracy in classifying images from the Oxford Pets dataset.

Metrics

The primary performance metric is accuracy, measured as the percentage of correctly classified images. No decision threshold is applied: the class with the highest predicted probability is taken as the prediction. No variation analysis is reported in this model card. Analyzing performance across breeds or image characteristics would help assess model robustness.
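The metric described above can be sketched in a few lines of plain Python. The function names are illustrative and are not taken from the training code.

```python
def predict(probabilities):
    """Return the index of the highest-probability class (no threshold)."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def accuracy(batch_probabilities, labels):
    """Percentage of examples whose argmax prediction matches the label."""
    correct = sum(predict(p) == y for p, y in zip(batch_probabilities, labels))
    return 100.0 * correct / len(labels)
```

Because the prediction is an argmax, applying it to raw logits instead of probabilities gives the same result.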

Evaluation Data

Evaluation uses the Oxford Pets dataset (pcuenq/oxford-pets), a widely used image classification benchmark that provides a diverse set of pet images for training and evaluation. Images are preprocessed with the ViTImageProcessor from Hugging Face Transformers, which resizes and normalizes them and converts them to the input format the ViT model expects.
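As a rough illustration of the normalization and format-conversion steps, the sketch below rescales 0-255 pixel values to [0, 1], standardizes each channel, and reorders the image from height x width x channel to channel x height x width. The mean and standard deviation of 0.5 per channel are an assumption about the processor configuration, and resizing is omitted. In practice the ViTImageProcessor performs all of these steps.

```python
def preprocess(image, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    """Convert an H x W x 3 image of 0-255 values to 3 x H x W normalized floats.

    The mean/std defaults are assumed values, not read from the real processor.
    """
    height, width = len(image), len(image[0])
    return [
        [
            [(image[y][x][c] / 255.0 - mean[c]) / std[c] for x in range(width)]
            for y in range(height)
        ]
        for c in range(3)
    ]
```

With these assumed statistics, pixel values end up in the range [-1, 1].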

Training Data

The training data is a subset of the Oxford Pets dataset. The dataset is split into training, validation, and test sets: 80% of the original data for training and 10% each for validation and test. The dataset is balanced across breeds.
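An 80/10/10 split like the one described can be sketched as follows. This is an assumption about how such a split might be produced: a seeded shuffle of example indices. The seed, the function name, and the absence of stratification are illustrative rather than taken from the original training script.

```python
import random


def split_indices(num_examples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle example indices and cut them into train/validation/test lists."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(num_examples * train_frac)
    n_val = int(num_examples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains after train and val
    return train, val, test
```

A stratified split, which samples each breed separately, would preserve the class balance in every subset more reliably than a plain shuffle.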

Quantitative Analyses

The final accuracy of the student model is evaluated on the test set. Further analysis can be conducted to investigate performance variations across breeds or other image characteristics, but it is not included in this initial model card.
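The per-breed analysis mentioned here could start from a breakdown like the sketch below, which groups predictions by their true label. The function name and the breed labels in the usage note are illustrative, and no results from this model are implied.

```python
from collections import defaultdict


def per_class_accuracy(predictions, labels):
    """Return a dict mapping each true label to the fraction predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label in zip(predictions, labels):
        total[label] += 1
        correct[label] += int(pred == label)
    return {label: correct[label] / total[label] for label in total}
```

For example, per_class_accuracy(["beagle", "pug"], ["beagle", "beagle"]) reports 0.5 for "beagle", which shows how a single aggregate accuracy can hide weak classes.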

Ethical Considerations

The dataset focuses on common pet breeds and may not represent the diversity of all animals. The model's predictions should be interpreted with caution and should not be used in ways that could perpetuate stereotypes or biases. The model should be used only for its intended purpose, not for harmful or malicious applications.

Caveats and Recommendations

This model is limited to classifying images of cats and dogs from the Oxford Pets dataset. It may not generalize well to other types of images or objects.

Further evaluation and potential improvements could include:

Exploring different knowledge distillation techniques or hyperparameters.
Evaluating the model's performance on other datasets.
Investigating potential biases and addressing fairness concerns.
Deploying the model for real-world use with appropriate monitoring and safeguards.
