Ganesh-KSV/vit-face-recognition-1

Model Details

Model type: Vision Transformer (ViT) for Image Classification

Fine-tuned from model: google/vit-base-patch16-384

Uses

Image classification based on facial features from the dataset. Link: https://www.kaggle.com/datasets/bhaveshmittal/celebrity-face-recognition-dataset

Downstream Use

Fine-tuning for other image classification tasks.

Transfer learning for related vision tasks.

Out-of-Scope Use

Tasks unrelated to image classification.

Sensitive applications without proper evaluation of biases and limitations.

Bias, Risks, and Limitations

Potential biases in the training dataset affecting model predictions.

Limitations in generalizability to different populations or image conditions not represented in the training data.

Risks associated with misclassification in critical applications.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. Evaluate the model's performance on your own data before deploying it in a production environment.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import ViTForImageClassification, ViTImageProcessor

model = ViTForImageClassification.from_pretrained("Ganesh-KSV/face-recognition-version1")
processor = ViTImageProcessor.from_pretrained("Ganesh-KSV/face-recognition-version1")

def predict(image):
    # Resize and normalize the image (e.g. a PIL image) into a batched tensor.
    inputs = processor(images=image, return_tensors="pt")
    # Gradients are not needed at inference time.
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    # Index of the highest-scoring class for each image in the batch.
    predictions = torch.argmax(logits, dim=-1)
    return predictions
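The function above returns a class index. To turn logits into readable labels with probabilities, something like the following sketch may help. The helper name `top_k_labels` and the toy logits and label names are illustrative assumptions, not part of this model's repository.

```python
import torch

def top_k_labels(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs for the first image."""
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k=min(k, probs.shape[-1]), dim=-1)
    values = top.values[0].tolist()
    indices = top.indices[0].tolist()
    return [(id2label[i], p) for i, p in zip(indices, values)]

# Toy stand-ins (assumptions) so the sketch runs without downloading the model.
example_logits = torch.tensor([[0.1, 2.0, -1.0]])
example_id2label = {0: "person_a", 1: "person_b", 2: "person_c"}

top = top_k_labels(example_logits, example_id2label, k=2)
for label, prob in top:
    print(f"{label}: {prob:.3f}")
```

With the real model, the inputs would be `model(**inputs).logits` and `model.config.id2label`, assuming the checkpoint's config carries the label names.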

Training Details

Training Data

Training Procedure:

Preprocessing:

Images were resized, augmented (rotation, color jitter, etc.), and normalized.

Training Hyperparameters:

Optimizer: Adam with learning rate 2e-5 and weight decay 1e-2

Scheduler: StepLR with step size 2 and gamma 0.5

Loss Function: CrossEntropyLoss

Epochs: 40

Batch Size: 4
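The hyperparameters above map onto PyTorch roughly as follows. This is a minimal sketch, assuming `torch.optim.Adam` (rather than AdamW) and a tiny linear layer standing in for the ViT so the loop runs quickly; it records the learning rate at the start of each epoch to show the StepLR decay.

```python
import torch
from torch import nn

# Stand-in for the ViT (an assumption); a tiny layer keeps the sketch fast.
model = nn.Linear(8, 3)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5, weight_decay=1e-2)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

lr_per_epoch = []
for epoch in range(6):  # the card reports 40 epochs; 6 is enough to see the decay
    lr_per_epoch.append(scheduler.get_last_lr()[0])
    inputs = torch.randn(4, 8)             # one random batch of size 4
    targets = torch.randint(0, 3, (4,))
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()                        # halves the learning rate every 2 epochs

print(lr_per_epoch)
```

With step size 2 and gamma 0.5, the learning rate at epoch e is 2e-5 * 0.5 ** (e // 2), so it halves every second epoch.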

Evaluation

Testing Data, Factors & Metrics

Testing Data

Validation split of the VGGFace dataset.

Factors

Performance was evaluated by loss and accuracy on the validation set.

Metrics

Loss and accuracy metrics for each epoch.

Results

Training and validation loss and accuracy plotted for 40 epochs.

Confusion matrix generated for the final validation results.
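Accuracy and a confusion matrix can be computed from predicted and true class indices along these lines. The helper `confusion_matrix` and the toy labels are illustrative assumptions; the card does not say which tooling produced its results.

```python
import torch

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for t, p in zip(y_true.tolist(), y_pred.tolist()):
        matrix[t, p] += 1
    return matrix

# Toy validation labels and predictions (assumed values for illustration).
y_true = torch.tensor([0, 1, 2, 2, 1])
y_pred = torch.tensor([0, 2, 2, 2, 1])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
# Correct predictions lie on the diagonal.
accuracy = cm.diag().sum().item() / cm.sum().item()
print(cm)
print(f"accuracy = {accuracy:.2f}")
```

Off-diagonal entries show which identities are confused with each other, which a single accuracy number hides.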

Summary

Model Examination

Model performance was examined through loss and accuracy plots and a confusion matrix.

Glossary

ViT: Vision Transformer

CrossEntropyLoss: A loss function used for classification tasks.

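Adam: An optimization algorithm that adapts the step size for each parameter using running estimates of the first and second moments of the gradients.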

StepLR: Learning rate scheduler that multiplies the learning rate by a fixed factor (gamma) every fixed number of epochs (step size).