finetuned
This model is a fine-tuned version of facebook/wav2vec2-large-robust-ft-swbd-300h. The training dataset was not recorded by the training script (it is logged as None). It achieves the following results on the evaluation set:
- Loss: 1.0281
- UAR: 0.7318
- ACC: 0.7721
For the test set:
- UAR: 0.74
- ACC: 0.794
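UAR (unweighted average recall) and ACC (accuracy) can diverge when classes are imbalanced, which is why both are reported. The following is a minimal sketch of how the two are commonly computed; the function names and the toy labels are illustrative assumptions, not the evaluation code used for this model.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the reference."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[label] / support[label] for label in support]
    return sum(recalls) / len(recalls)


# Made-up labels for illustration only (not taken from the evaluation set).
y_true = ["anger"] * 4 + ["neutral"] * 4 + ["sadness"] * 2 + ["happiness"] * 2
y_pred = ["anger"] * 4 + ["neutral"] * 4 + ["sadness"] * 2 + ["neutral"] * 2

acc = accuracy(y_true, y_pred)                   # 10 of 12 samples correct
uar = unweighted_average_recall(y_true, y_pred)  # (1 + 1 + 1 + 0) / 4
```

In this toy case one class is never predicted, so UAR drops to 0.75 while accuracy stays above 0.83. A classifier that always outputs a single label scores a UAR of 0.25 on four classes, which is consistent with the first rows of the training results table.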
Model description
This model predicts one of four emotion categories from an audio file. The labels are 'anger', 'happiness', 'sadness', and 'neutral'. This wav2vec2-based model is known to fail to detect 'happiness'.
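As a rough illustration of how a four-class output could be turned into a label, the sketch below applies a softmax to one vector of logits and picks the most probable class. The label order, the helper names, and the logits are assumptions made for the example; the actual index-to-label mapping is defined by the model's configuration.

```python
import math

# Assumed label order; the real index-to-label mapping lives in the model config.
LABELS = ["anger", "happiness", "sadness", "neutral"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for one utterance, not real model output.
label, prob = decode([2.1, -0.3, 0.4, 1.2])
```

Given the limitation noted above, a low probability for 'happiness' should not be read as evidence that the emotion is absent.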
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
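The relationship between these values can be sketched in a few lines. The example assumes a single device, no warmup (none is listed), and that the 60 steps shown in the training results table make up the full schedule; `linear_lr` is a hypothetical helper, not a library function.

```python
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 4
total_steps = 60  # assumption: last step listed in the training results table

# Samples contributing to each optimizer update (single device assumed).
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Linear warmup (none assumed here) followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


# Learning rate at the start, the midpoint, and the end of training.
schedule = [linear_lr(s) for s in (0, 30, 60)]
```

Under these assumptions the effective batch size is 8 × 4 = 32, matching total_train_batch_size, and the learning rate falls from 0.0001 to zero over the run.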
Training results
Training Loss | Epoch | Step | Validation Loss | UAR | ACC |
---|---|---|---|---|---|
No log | 0.15 | 1 | 1.3899 | 0.25 | 0.1985 |
No log | 0.31 | 2 | 1.3850 | 0.25 | 0.1985 |
No log | 0.46 | 3 | 1.3815 | 0.25 | 0.1985 |
No log | 0.62 | 4 | 1.3772 | 0.25 | 0.1985 |
No log | 0.77 | 5 | 1.3714 | 0.25 | 0.4044 |
No log | 0.92 | 6 | 1.3656 | 0.25 | 0.4044 |
1.4878 | 1.08 | 7 | 1.3610 | 0.25 | 0.4044 |
1.4878 | 1.23 | 8 | 1.3583 | 0.25 | 0.4044 |
1.4878 | 1.38 | 9 | 1.3549 | 0.25 | 0.4044 |
1.4878 | 1.54 | 10 | 1.3518 | 0.25 | 0.4044 |
1.4878 | 1.69 | 11 | 1.3491 | 0.25 | 0.4044 |
1.4878 | 1.85 | 12 | 1.3458 | 0.25 | 0.4044 |
1.4878 | 2.0 | 13 | 1.3425 | 0.25 | 0.4044 |
1.2316 | 2.15 | 14 | 1.3401 | 0.25 | 0.4044 |
1.2316 | 2.31 | 15 | 1.3380 | 0.25 | 0.4044 |
1.2316 | 2.46 | 16 | 1.3354 | 0.25 | 0.4044 |
1.2316 | 2.62 | 17 | 1.3326 | 0.25 | 0.4044 |
1.2316 | 2.77 | 18 | 1.3292 | 0.2778 | 0.4265 |
1.2316 | 2.92 | 19 | 1.3250 | 0.2963 | 0.4412 |
1.3835 | 3.08 | 20 | 1.3212 | 0.3519 | 0.4853 |
1.3835 | 3.23 | 21 | 1.3158 | 0.4029 | 0.5221 |
1.3835 | 3.38 | 22 | 1.3096 | 0.5047 | 0.6029 |
1.3835 | 3.54 | 23 | 1.3019 | 0.5695 | 0.6544 |
1.3835 | 3.69 | 24 | 1.2944 | 0.6485 | 0.7059 |
1.3835 | 3.85 | 25 | 1.2856 | 0.6534 | 0.6985 |
1.3835 | 4.0 | 26 | 1.2773 | 0.6768 | 0.7059 |
1.1038 | 4.15 | 27 | 1.2688 | 0.6540 | 0.6691 |
1.1038 | 4.31 | 28 | 1.2554 | 0.6404 | 0.6471 |
1.1038 | 4.46 | 29 | 1.2404 | 0.6359 | 0.6397 |
1.1038 | 4.62 | 30 | 1.2222 | 0.6586 | 0.6765 |
1.1038 | 4.77 | 31 | 1.2057 | 0.6631 | 0.6838 |
1.1038 | 4.92 | 32 | 1.1874 | 0.6769 | 0.6985 |
1.075 | 5.08 | 33 | 1.1624 | 0.6953 | 0.7206 |
1.075 | 5.23 | 34 | 1.1427 | 0.7182 | 0.75 |
1.075 | 5.38 | 35 | 1.1270 | 0.7182 | 0.75 |
1.075 | 5.54 | 36 | 1.1085 | 0.7227 | 0.7574 |
1.075 | 5.69 | 37 | 1.0982 | 0.7227 | 0.7574 |
1.075 | 5.85 | 38 | 1.0943 | 0.7227 | 0.7574 |
1.075 | 6.0 | 39 | 1.0930 | 0.7136 | 0.7426 |
0.7211 | 6.15 | 40 | 1.0903 | 0.7091 | 0.7353 |
0.7211 | 6.31 | 41 | 1.0858 | 0.7091 | 0.7353 |
0.7211 | 6.46 | 42 | 1.0816 | 0.7045 | 0.7279 |
0.7211 | 6.62 | 43 | 1.0734 | 0.7091 | 0.7353 |
0.7211 | 6.77 | 44 | 1.0617 | 0.7136 | 0.7426 |
0.7211 | 6.92 | 45 | 1.0536 | 0.7136 | 0.7426 |
0.6595 | 7.08 | 46 | 1.0450 | 0.7318 | 0.7721 |
0.6595 | 7.23 | 47 | 1.0370 | 0.7364 | 0.7794 |
0.6595 | 7.38 | 48 | 1.0323 | 0.7364 | 0.7794 |
0.6595 | 7.54 | 49 | 1.0301 | 0.7364 | 0.7794 |
0.6595 | 7.69 | 50 | 1.0307 | 0.7364 | 0.7794 |
0.6595 | 7.85 | 51 | 1.0302 | 0.7318 | 0.7721 |
0.6595 | 8.0 | 52 | 1.0307 | 0.7318 | 0.7721 |
0.5067 | 8.15 | 53 | 1.0317 | 0.7318 | 0.7721 |
0.5067 | 8.31 | 54 | 1.0324 | 0.7318 | 0.7721 |
0.5067 | 8.46 | 55 | 1.0324 | 0.7318 | 0.7721 |
0.5067 | 8.62 | 56 | 1.0326 | 0.7273 | 0.7647 |
0.5067 | 8.77 | 57 | 1.0315 | 0.7318 | 0.7721 |
0.5067 | 8.92 | 58 | 1.0297 | 0.7318 | 0.7721 |
0.5617 | 9.08 | 59 | 1.0287 | 0.7318 | 0.7721 |
0.5617 | 9.23 | 60 | 1.0281 | 0.7318 | 0.7721 |
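
The lowest validation loss and the highest UAR do not fall on the same step in the table above. A minimal sketch of how one might pick a checkpoint under either criterion, using a handful of rows copied from the table (the tuple layout and variable names are assumptions for the example):

```python
# (step, validation_loss, uar, acc) copied from a few rows of the table above.
rows = [
    (26, 1.2773, 0.6768, 0.7059),
    (39, 1.0930, 0.7136, 0.7426),
    (47, 1.0370, 0.7364, 0.7794),
    (49, 1.0301, 0.7364, 0.7794),
    (52, 1.0307, 0.7318, 0.7721),
    (60, 1.0281, 0.7318, 0.7721),
]

# Criterion 1: lowest validation loss.
lowest_loss = min(rows, key=lambda r: r[1])

# Criterion 2: highest UAR, ties broken by the lower validation loss.
highest_uar = max(rows, key=lambda r: (r[2], -r[1]))
```

With these rows, the loss criterion selects step 60 (the values reported at the top of this card), while the UAR criterion selects step 49.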
Framework versions
- Transformers 4.32.0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.13.3
Model tree for Bagus/wav2vec2_swbd_emodb
Base model
facebook/wav2vec2-large-robust-ft-swbd-300h