metadata
license: apache-2.0
base_model: google/vit-base-patch16-224-in21k
tags:
  - generated_from_trainer
datasets:
  - FastJobs/Visual_Emotional_Analysis
metrics:
  - accuracy
  - precision
  - f1
model-index:
  - name: emotion_classification
    results:
      - task:
          name: Image Classification
          type: image-classification
        dataset:
          name: FastJobs/Visual_Emotional_Analysis
          type: FastJobs/Visual_Emotional_Analysis
          config: default
          split: train
          args: default
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.63125
          - name: Precision
            type: precision
            value: 0.6430986797647803
          - name: F1
            type: f1
            value: 0.6224944698106615

Emotion Classification

This model is a fine-tuned version of google/vit-base-patch16-224-in21k on the FastJobs/Visual_Emotional_Analysis dataset.

In theory, the accuracy for a random guess on this dataset is 0.125 (one correct choice out of 8 labels).
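As a sanity check on the chance baseline: a uniform guess over k balanced classes is correct 1/k of the time and has a cross-entropy of ln k. The sketch below assumes eight emotion labels (`NUM_CLASSES = 8` is an assumption here; adjust it if the label set differs). The first-epoch training loss in the results table (2.0742) sits close to ln 8.

```python
import math

NUM_CLASSES = 8  # assumption: eight emotion labels in the dataset


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of a uniform random guess on balanced classes."""
    return 1.0 / num_classes


def chance_cross_entropy(num_classes: int) -> float:
    """Cross-entropy (in nats) of a uniform prediction over all classes."""
    return math.log(num_classes)


print(f"chance accuracy:      {chance_accuracy(NUM_CLASSES):.4f}")
print(f"chance cross-entropy: {chance_cross_entropy(NUM_CLASSES):.4f}")
```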

It achieves the following results on the evaluation set:

  • Loss: 1.1031
  • Accuracy: 0.6312
  • Precision: 0.6431
  • F1: 0.6225

Model description

The base model is the Vision Transformer (ViT) base version, pre-trained on ImageNet-21K and released by Google. Further details can be found on their repo.

Training and evaluation data

Data Split

The data was split 4:1 into training and development sets using a random seed of 42. Batching also used a seed of 42; the two seeds are set separately and are unrelated.
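A minimal sketch of a seeded 4:1 split, written in plain Python so it runs without downloads. The tool actually used for the split is not stated; with the `datasets` library, `Dataset.train_test_split(test_size=0.2, seed=42)` would be one option. `NUM_EXAMPLES = 800` is a hypothetical dataset size, chosen because it is consistent with the 10 steps per epoch shown in the training results at a batch size of 64.

```python
import math
import random


def split_indices(num_examples: int, dev_fraction: float = 0.2, seed: int = 42):
    """Shuffle example indices with a fixed seed and split them train/dev."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    num_dev = int(round(num_examples * dev_fraction))
    return indices[num_dev:], indices[:num_dev]


NUM_EXAMPLES = 800  # hypothetical size, not stated in this card
BATCH_SIZE = 64     # train_batch_size from the hyperparameters

train_idx, dev_idx = split_indices(NUM_EXAMPLES)
steps_per_epoch = math.ceil(len(train_idx) / BATCH_SIZE)

print(len(train_idx), len(dev_idx), steps_per_epoch)
```

Because the shuffle uses its own `random.Random(seed)` instance, repeating the call with the same seed reproduces the same split.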

Pre-processing and Augmentation

The main pre-processing phase for both training and evaluation includes:

  • Resizing each image to (224, 224, 3) with bilinear interpolation, because the original model was trained on ImageNet images at that size
  • Normalizing images using a mean and standard deviation of [0.5, 0.5, 0.5] per channel, just like the original model
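To make the normalization step concrete: with a mean and standard deviation of 0.5, pixel values are mapped to the range [-1, 1]. The sketch below assumes pixels are first rescaled from 0..255 to 0..1, as image processors for ViT commonly do; the helper name `normalize_pixel` is hypothetical.

```python
MEAN = 0.5  # per-channel mean from the pre-processing description
STD = 0.5   # per-channel standard deviation


def normalize_pixel(value_0_255: int) -> float:
    """Rescale an 8-bit pixel to [0, 1], then normalize with MEAN and STD."""
    scaled = value_0_255 / 255.0  # assumption: rescale happens before normalizing
    return (scaled - MEAN) / STD


for pixel in (0, 128, 255):
    print(pixel, round(normalize_pixel(pixel), 4))
```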

Other than the aforementioned pre-processing, the training set was augmented using:

  • Random horizontal & vertical flip
  • Color jitter
  • Random resized crop

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_steps: 20
  • num_epochs: 100
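The learning-rate schedule implied by these settings can be sketched in plain Python: linear warmup for 20 steps, then cosine decay with hard restarts. In Transformers, `cosine_with_restarts` maps to `get_cosine_with_hard_restarts_schedule_with_warmup`; the function below only mirrors that shape and is not the library code. `TOTAL_STEPS = 1000` (100 epochs at 10 steps each) and `NUM_CYCLES = 1` are assumptions, since neither value is stated here.

```python
import math

BASE_LR = 2e-4       # learning_rate
WARMUP_STEPS = 20    # lr_scheduler_warmup_steps
TOTAL_STEPS = 1000   # assumption: 100 epochs x 10 steps per epoch
NUM_CYCLES = 1       # assumption: number of cosine cycles is not stated


def lr_at(step: int, base_lr: float = BASE_LR, warmup: int = WARMUP_STEPS,
          total: int = TOTAL_STEPS, cycles: int = NUM_CYCLES) -> float:
    """Learning rate at a given optimizer step."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr during warmup
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that jumps back to base_lr at the start of each cycle
    cycle_pos = (cycles * progress) % 1.0
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_pos)))


for step in (0, 10, 20, 510, 1000):
    print(step, f"{lr_at(step):.6f}")
```

With more than one cycle, the rate would reset to its peak partway through training, which is what distinguishes this scheduler from a plain cosine decay.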

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | F1 |
|---|---|---|---|---|---|---|
| 2.0742 | 1.0 | 10 | 2.0533 | 0.1938 | 0.1942 | 0.1858 |
| 2.0081 | 2.0 | 20 | 1.8908 | 0.3438 | 0.3701 | 0.3368 |
| 1.7211 | 3.0 | 30 | 1.5199 | 0.5312 | 0.4821 | 0.4844 |
| 1.5641 | 4.0 | 40 | 1.4248 | 0.4875 | 0.5314 | 0.4532 |
| 1.3979 | 5.0 | 50 | 1.2973 | 0.5375 | 0.5162 | 0.5023 |
| 1.2997 | 6.0 | 60 | 1.2016 | 0.525 | 0.4828 | 0.4826 |
| 1.2348 | 7.0 | 70 | 1.1670 | 0.5875 | 0.6375 | 0.5941 |
| 1.1481 | 8.0 | 80 | 1.1292 | 0.6 | 0.6111 | 0.5961 |
| 1.079 | 9.0 | 90 | 1.1782 | 0.5188 | 0.5265 | 0.5005 |
| 0.9909 | 10.0 | 100 | 1.1115 | 0.5813 | 0.5892 | 0.5668 |
| 0.9662 | 11.0 | 110 | 1.1047 | 0.5938 | 0.6336 | 0.5723 |
| 0.8149 | 12.0 | 120 | 1.0944 | 0.5563 | 0.5648 | 0.5499 |
| 0.7661 | 13.0 | 130 | 1.0932 | 0.5625 | 0.5738 | 0.5499 |
| 0.7067 | 14.0 | 140 | 1.0787 | 0.6062 | 0.6318 | 0.6045 |
| 0.6708 | 15.0 | 150 | 1.1140 | 0.6188 | 0.6463 | 0.6134 |
| 0.6268 | 16.0 | 160 | 1.0875 | 0.5813 | 0.6016 | 0.5815 |
| 0.5473 | 17.0 | 170 | 1.1483 | 0.5938 | 0.6027 | 0.5844 |
| 0.5228 | 18.0 | 180 | 1.1031 | 0.6312 | 0.6431 | 0.6225 |
| 0.4805 | 19.0 | 190 | 1.1747 | 0.5813 | 0.6057 | 0.5848 |
| 0.4995 | 20.0 | 200 | 1.1865 | 0.6062 | 0.6062 | 0.5980 |
| 0.456 | 21.0 | 210 | 1.2619 | 0.6 | 0.6020 | 0.5843 |
| 0.4697 | 22.0 | 220 | 1.2476 | 0.5625 | 0.5804 | 0.5647 |
| 0.3656 | 23.0 | 230 | 1.3106 | 0.6125 | 0.6645 | 0.6130 |
| 0.394 | 24.0 | 240 | 1.3398 | 0.5437 | 0.5627 | 0.5460 |
| 0.35 | 25.0 | 250 | 1.3391 | 0.5938 | 0.5940 | 0.5860 |
| 0.3508 | 26.0 | 260 | 1.2846 | 0.575 | 0.6070 | 0.5821 |
| 0.3106 | 27.0 | 270 | 1.3495 | 0.575 | 0.6258 | 0.5663 |
| 0.3265 | 28.0 | 280 | 1.4450 | 0.5375 | 0.6512 | 0.5248 |
| 0.2806 | 29.0 | 290 | 1.5145 | 0.5188 | 0.5840 | 0.5151 |
| 0.3276 | 30.0 | 300 | 1.5207 | 0.5188 | 0.5741 | 0.5164 |
| 0.2932 | 31.0 | 310 | 1.3179 | 0.6312 | 0.6421 | 0.6298 |
| 0.3542 | 32.0 | 320 | 1.3720 | 0.5875 | 0.6157 | 0.5780 |
| 0.3321 | 33.0 | 330 | 1.4787 | 0.5625 | 0.6088 | 0.5714 |
| 0.2641 | 34.0 | 340 | 1.5468 | 0.5375 | 0.5817 | 0.5385 |
| 0.2432 | 35.0 | 350 | 1.4893 | 0.5687 | 0.6012 | 0.5538 |
| 0.275 | 36.0 | 360 | 1.4775 | 0.575 | 0.5827 | 0.5710 |
| 0.239 | 37.0 | 370 | 1.4812 | 0.575 | 0.6100 | 0.5739 |
| 0.2658 | 38.0 | 380 | 1.7335 | 0.5563 | 0.6547 | 0.5436 |
| 0.3026 | 39.0 | 390 | 1.5692 | 0.5875 | 0.6401 | 0.5854 |
| 0.1867 | 40.0 | 400 | 1.4908 | 0.5687 | 0.5921 | 0.5741 |
| 0.1931 | 41.0 | 410 | 1.6608 | 0.5375 | 0.5834 | 0.5396 |
| 0.2416 | 42.0 | 420 | 1.5172 | 0.5938 | 0.6259 | 0.5935 |
| 0.1943 | 43.0 | 430 | 1.5260 | 0.5437 | 0.5775 | 0.5498 |
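The reported evaluation results (loss 1.1031, accuracy 0.6312) match the epoch 18 row, which is the first epoch to reach the highest accuracy, not the epoch with the lowest validation loss. How the checkpoint was actually selected is not stated; the sketch below simply compares the two criteria on a handful of rows copied from the table, and selection by accuracy is an assumption.

```python
# (epoch, validation_loss, accuracy) for selected rows of the results table
ROWS = [
    (14, 1.0787, 0.6062),
    (15, 1.1140, 0.6188),
    (18, 1.1031, 0.6312),
    (23, 1.3106, 0.6125),
    (31, 1.3179, 0.6312),
]

# max() keeps the first row on ties, so the earlier of epochs 18 and 31 wins
best_by_accuracy = max(ROWS, key=lambda row: row[2])
best_by_loss = min(ROWS, key=lambda row: row[1])

print("best by accuracy:", best_by_accuracy)
print("best by validation loss:", best_by_loss)
```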

Framework versions

  • Transformers 4.33.1
  • PyTorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3