
vit-Facial-Expression-Recognition

This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the FER2013, MMI Facial Expression Database, and AffectNet datasets. It achieves the following results on the evaluation set:

  • Loss: 0.4503
  • Accuracy: 0.8434

Model description

The vit-Facial-Expression-Recognition model is a Vision Transformer (ViT) fine-tuned for facial emotion recognition.

It is trained on the FER2013, MMI Facial Expression Database, and AffectNet datasets, which consist of facial images categorized into seven emotions:

  • Angry
  • Disgust
  • Fear
  • Happy
  • Sad
  • Surprise
  • Neutral
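
As a rough usage sketch, the model can presumably be loaded with the Transformers image-classification pipeline. The repository id below is taken from this model's Hub page. The helper names classify_expression and format_predictions are hypothetical, and the sketch assumes the repository ships a usable image processor config, which this card does not confirm. The label strings returned come from the checkpoint's own label mapping and may not match the spelling or order of the list above.

```python
MODEL_ID = "motheecreator/vit-Facial-Expression-Recognition"


def classify_expression(image_path, top_k=3):
    """Return the top_k predicted emotions for one face image.

    transformers is imported lazily so that the formatting helper below
    can be used without it. The checkpoint is downloaded on first use.
    """
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    # The pipeline returns a list of {"label": str, "score": float} dicts.
    return classifier(image_path, top_k=top_k)


def format_predictions(predictions):
    """Turn pipeline output into 'label: percent' strings, best first."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f"{p['label']}: {p['score']:.1%}" for p in ranked]
```

Calling format_predictions(classify_expression("face.jpg")), where "face.jpg" is a placeholder path to a cropped face image, would list the most likely emotions with their scores.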

Data preprocessing

The input images are preprocessed before being fed into the model. The preprocessing steps include:

  • Resizing: Images are resized to the model's input size (224×224 pixels, as implied by the vit-base-patch16-224 base checkpoint).
  • Normalization: Pixel values are normalized to a specific range.
  • Data Augmentation: Random transformations such as rotations, flips, and zooms are applied to augment the training dataset.
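
The steps above can be sketched with NumPy as follows. This is only an illustration, not the pipeline actually used for training: the 224×224 size is inferred from the base checkpoint name, the mean and standard deviation of 0.5 are an assumption (the card does not state the normalization range), nearest-neighbour resizing stands in for whatever interpolation was used, and only a horizontal flip is shown (rotations and zooms are omitted). All function names are hypothetical.

```python
import numpy as np

INPUT_SIZE = 224      # implied by the base checkpoint name (patch16-224)
MEAN, STD = 0.5, 0.5  # assumption: not stated in this card


def resize_nearest(image, size=INPUT_SIZE):
    """Resize an HxWxC image to size x size by nearest-neighbour sampling."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows[:, None], cols]


def normalize(image, mean=MEAN, std=STD):
    """Scale uint8 pixels to [0, 1], then shift by mean and divide by std."""
    scaled = image.astype(np.float32) / 255.0
    return (scaled - mean) / std


def random_horizontal_flip(image, rng, p=0.5):
    """Mirror the image left to right with probability p."""
    return image[:, ::-1] if rng.random() < p else image


def preprocess(image, rng=None):
    """Return a CxHxW float32 array; augment only when an rng is supplied."""
    if rng is not None:
        image = random_horizontal_flip(image, rng)
    resized = resize_nearest(image)
    return normalize(resized).transpose(2, 0, 1)
```

Passing a NumPy random generator (for example np.random.default_rng(42)) enables the augmentation for training; omitting it gives the deterministic path suitable for evaluation.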

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 3
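
To make the relationship between these values concrete, the sketch below recomputes the total batch size and evaluates a cosine schedule with linear warmup at a given optimizer step. It follows the usual half-cosine formulation and is not taken from the training script. The total step count of 1780 is an assumption, estimated from the step and epoch columns of the training results (about 594 steps per epoch over 3 epochs); a single device is assumed because 32 × 8 already equals 256. The function names are hypothetical.

```python
import math

LEARNING_RATE = 3e-5
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 1000
TOTAL_STEPS = 1780  # assumption: estimated, not stated in this card


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Number of samples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def lr_at_step(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS,
               total=TOTAL_STEPS):
    """Learning rate under linear warmup followed by half-cosine decay."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 500, 1000, 1400, 1780):
    print(step, lr_at_step(step))
```

Under these assumptions the warmup covers more than half of training, so the learning rate peaks at 3e-05 at step 1000 and decays toward zero only over the remaining steps.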

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
------------- | ----- | ---- | --------------- | --------
1.3548        | 0.17  | 100  | 0.8024          | 0.7418
1.047         | 0.34  | 200  | 0.6823          | 0.7653
0.9398        | 0.51  | 300  | 0.6264          | 0.7827
0.8618        | 0.67  | 400  | 0.5857          | 0.7973
0.8363        | 0.84  | 500  | 0.5532          | 0.8104
0.8018        | 1.01  | 600  | 0.5279          | 0.8196
0.7567        | 1.18  | 700  | 0.5110          | 0.8248
0.7521        | 1.35  | 800  | 0.5080          | 0.8259
0.741         | 1.52  | 900  | 0.5002          | 0.8271
0.7229        | 1.69  | 1000 | 0.4967          | 0.8263
0.7157        | 1.85  | 1100 | 0.4876          | 0.8326
0.6868        | 2.02  | 1200 | 0.4836          | 0.8342
0.6605        | 2.19  | 1300 | 0.4711          | 0.8384
0.6449        | 2.36  | 1400 | 0.4608          | 0.8406
0.6085        | 2.53  | 1500 | 0.4503          | 0.8434
0.6178        | 2.7   | 1600 | 0.4434          | 0.8478
0.6166        | 2.87  | 1700 | 0.4420          | 0.8486

Framework versions

  • Transformers 4.36.0
  • PyTorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.15.0

Model size: 85.8M parameters (Safetensors, F32)
