w2v-bert-2.0-nonstudio_and_studioRecords
This model is a fine-tuned version of facebook/w2v-bert-2.0 on the following datasets: IMASC, MSC, OpenSLR Malayalam (train split), Festvox Malayalam, and common_voice_16_1.
Its results on the evaluation set (validation loss and WER per checkpoint) are listed under Training results.
Training procedure
Trained on an NVIDIA A100 GPU.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
- mixed_precision_training: Native AMP
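The relationship between some of these values can be checked with a little arithmetic. The sketch below is illustrative only and is not the original training script: it recomputes the total train batch size from the per-device batch size and the gradient accumulation steps, and evaluates a linear schedule with warmup at a given step. The names `Hyperparameters`, `effective_batch_size`, and `linear_lr` are hypothetical, a single device is assumed, and `ASSUMED_TOTAL_STEPS` is a rough estimate (the card does not state the exact number of optimizer steps).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    """Values copied from the hyperparameter list in this card."""
    learning_rate: float = 5e-05
    train_batch_size: int = 16
    eval_batch_size: int = 8
    seed: int = 42
    gradient_accumulation_steps: int = 2
    lr_scheduler_warmup_steps: int = 500
    num_epochs: int = 10


# Assumption: the exact total step count is not given in the card.
# 600 steps correspond to about 0.46 epoch, so 10 epochs is roughly 13,000 steps.
ASSUMED_TOTAL_STEPS = 13_000


def effective_batch_size(hp: Hyperparameters, num_devices: int = 1) -> int:
    """Samples consumed per optimizer update (num_devices=1 is an assumption)."""
    return hp.train_batch_size * hp.gradient_accumulation_steps * num_devices


def linear_lr(step: int, hp: Hyperparameters, total_steps: int) -> float:
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    if step < hp.lr_scheduler_warmup_steps:
        return hp.learning_rate * step / hp.lr_scheduler_warmup_steps
    remaining = max(0, total_steps - step)
    decay_span = max(1, total_steps - hp.lr_scheduler_warmup_steps)
    return hp.learning_rate * remaining / decay_span


hp = Hyperparameters()
print("total_train_batch_size:", effective_batch_size(hp))
for step in (0, 250, 500, 6_000, ASSUMED_TOTAL_STEPS):
    print(step, linear_lr(step, hp, ASSUMED_TOTAL_STEPS))
```

Under these assumptions the learning rate peaks at 5e-05 at step 500 and then falls linearly, which matches the `linear` scheduler type and the 500 warmup steps listed above.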
Training results
| Training Loss | Epoch | Step | Validation Loss | WER |
|:-------------:|:-----:|:-----:|:---------------:|:------:|
| 1.1416 | 0.46 | 600 | 0.3393 | 0.4616 |
| 0.1734 | 0.92 | 1200 | 0.2414 | 0.3493 |
| 0.1254 | 1.38 | 1800 | 0.2205 | 0.2963 |
| 0.1097 | 1.84 | 2400 | 0.2157 | 0.3133 |
| 0.0923 | 2.3 | 3000 | 0.1854 | 0.2473 |
| 0.0792 | 2.76 | 3600 | 0.1939 | 0.2471 |
| 0.0696 | 3.22 | 4200 | 0.1720 | 0.2282 |
| 0.0589 | 3.68 | 4800 | 0.1768 | 0.2013 |
| 0.0552 | 4.14 | 5400 | 0.1635 | 0.1864 |
| 0.0437 | 4.6 | 6000 | 0.1501 | 0.1826 |
| 0.0408 | 5.06 | 6600 | 0.1500 | 0.1645 |
| 0.0314 | 5.52 | 7200 | 0.1559 | 0.1655 |
| 0.0317 | 5.98 | 7800 | 0.1448 | 0.1553 |
| 0.022 | 6.44 | 8400 | 0.1592 | 0.1590 |
| 0.0218 | 6.9 | 9000 | 0.1431 | 0.1458 |
| 0.0154 | 7.36 | 9600 | 0.1514 | 0.1366 |
| 0.0141 | 7.82 | 10200 | 0.1540 | 0.1383 |
| 0.0113 | 8.28 | 10800 | 0.1558 | 0.1391 |
| 0.0085 | 8.74 | 11400 | 0.1612 | 0.1356 |
| 0.0072 | 9.2 | 12000 | 0.1697 | 0.1289 |
| 0.0046 | 9.66 | 12600 | 0.1722 | 0.1299 |
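The WER values above are word error rates: the number of word substitutions, deletions, and insertions needed to turn the model transcript into the reference, divided by the number of reference words. As a minimal sketch of that definition (not the evaluation code used for this model, which is not shown in this card), the hypothetical helper `word_error_rate` below computes it with a word-level edit distance. It splits on whitespace only and applies no text normalization, which is an assumption; real evaluations often normalize punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("b" -> "x") and one deletion ("d") over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Read this way, a WER of 0.1299 means roughly 13 word errors per 100 reference words on the evaluation set.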
Framework versions
- Transformers 4.39.3
- Pytorch 2.1.1+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1