---
license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
- generated_from_trainer
datasets:
- ml-superb-subset
metrics:
- wer
model-index:
- name: amh_finetune
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: ml-superb-subset
      type: ml-superb-subset
      config: amh
      split: test
      args: amh
    metrics:
    - name: Wer
      type: wer
      value: 97.41641337386018
---

# amh_finetune

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the ml-superb-subset dataset.
It achieves the following results on the evaluation set:
- Loss: 2.8917
- Wer: 97.4164

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 64
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 25
- training_steps: 500
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch    | Step | Validation Loss | Wer      |
|:-------------:|:--------:|:----:|:---------------:|:--------:|
| 22.5796       | 2.2222   | 10   | 17.1583         | 100.0    |
| 9.5568        | 4.4444   | 20   | 7.4797          | 100.0    |
| 4.3875        | 6.6667   | 30   | 3.9841          | 100.0    |
| 3.8631        | 8.8889   | 40   | 3.8281          | 100.0    |
| 3.8298        | 11.1111  | 50   | 3.8117          | 100.0    |
| 3.7925        | 13.3333  | 60   | 3.7866          | 100.0    |
| 3.802         | 15.5556  | 70   | 3.7763          | 100.0    |
| 3.7845        | 17.7778  | 80   | 3.7681          | 100.0    |
| 3.7732        | 20.0     | 90   | 3.7627          | 100.0    |
| 3.7547        | 22.2222  | 100  | 3.7625          | 100.0    |
| 3.7471        | 24.4444  | 110  | 3.7588          | 100.0    |
| 3.7378        | 26.6667  | 120  | 3.7244          | 100.0    |
| 3.7278        | 28.8889  | 130  | 3.7337          | 100.0    |
| 3.71          | 31.1111  | 140  | 3.7188          | 100.0    |
| 3.6966        | 33.3333  | 150  | 3.7076          | 100.0    |
| 3.6811        | 35.5556  | 160  | 3.6916          | 100.0    |
| 3.6741        | 37.7778  | 170  | 3.6898          | 100.0    |
| 3.6337        | 40.0     | 180  | 3.6486          | 100.0    |
| 3.5766        | 42.2222  | 190  | 3.5913          | 100.0    |
| 3.5251        | 44.4444  | 200  | 3.5318          | 100.0    |
| 3.4533        | 46.6667  | 210  | 3.4549          | 100.0    |
| 3.3664        | 48.8889  | 220  | 3.3877          | 100.0    |
| 3.2963        | 51.1111  | 230  | 3.2852          | 100.0    |
| 3.1237        | 53.3333  | 240  | 3.1187          | 100.0    |
| 2.9356        | 55.5556  | 250  | 2.9620          | 100.0    |
| 2.7107        | 57.7778  | 260  | 2.7665          | 100.0    |
| 2.477         | 60.0     | 270  | 2.5155          | 99.3921  |
| 2.1786        | 62.2222  | 280  | 2.2953          | 98.4043  |
| 1.897         | 64.4444  | 290  | 2.1781          | 97.5684  |
| 1.6863        | 66.6667  | 300  | 2.1825          | 97.5684  |
| 1.4954        | 68.8889  | 310  | 2.1240          | 96.2766  |
| 1.3132        | 71.1111  | 320  | 2.1476          | 94.3769  |
| 1.1333        | 73.3333  | 330  | 2.2088          | 95.6687  |
| 0.9827        | 75.5556  | 340  | 2.2591          | 94.9088  |
| 0.9019        | 77.7778  | 350  | 2.4481          | 101.0638 |
| 0.7936        | 80.0     | 360  | 2.5467          | 103.4195 |
| 0.7015        | 82.2222  | 370  | 2.5279          | 95.5927  |
| 0.631         | 84.4444  | 380  | 2.6338          | 95.8207  |
| 0.5849        | 86.6667  | 390  | 2.6840          | 96.8085  |
| 0.5549        | 88.8889  | 400  | 2.7048          | 97.4164  |
| 0.5137        | 91.1111  | 410  | 2.7910          | 96.0486  |
| 0.4905        | 93.3333  | 420  | 2.8070          | 98.7842  |
| 0.4603        | 95.5556  | 430  | 2.8552          | 95.2888  |
| 0.457         | 97.7778  | 440  | 2.8382          | 95.8207  |
| 0.442         | 100.0    | 450  | 2.8831          | 98.2523  |
| 0.4437        | 102.2222 | 460  | 2.8800          | 97.5684  |
| 0.4346        | 104.4444 | 470  | 2.8805          | 97.7964  |
| 0.4341        | 106.6667 | 480  | 2.8864          | 97.6444  |
| 0.4319        | 108.8889 | 490  | 2.8911          | 97.3404  |
| 0.4403        | 111.1111 | 500  | 2.8917          | 97.4164  |

### Framework versions

- Transformers 4.41.0
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1
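
The Wer column in the training results exceeds 100 at steps 350 and 360. That is possible under the usual definition of word error rate, where substitutions, deletions, and insertions are all counted against the number of reference words. The card does not say which implementation produced its numbers, so the following is only a minimal pure-Python sketch of that standard definition. It assumes whitespace tokenization and reports a percentage, which is how the table values appear to be scaled. The function name `word_error_rate` and the example strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length, as a percentage."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens, not real transcripts from the dataset.
    print(word_error_rate("a b c", "a b c"))
    print(word_error_rate("a b", "x y z"))
```

In the second call the hypothesis has two substitutions and one insertion against a two-word reference, so the error count is larger than the reference length and the rate goes above 100.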
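
The batch-size and scheduler entries in the training hyperparameters can be checked with a little arithmetic. The sketch below assumes a single device, so that `total_train_batch_size` is `train_batch_size` times `gradient_accumulation_steps`. It also assumes the common formulation of a cosine schedule with linear warmup: the rate rises linearly to the peak over the warmup steps, then decays along half a cosine period to zero at the final step. The helper names are hypothetical, and the exact scheduler used by the training run may differ in detail.

```python
import math

# Values copied from the "Training hyperparameters" list.
LEARNING_RATE = 0.001
WARMUP_STEPS = 25
TRAINING_STEPS = 500
TRAIN_BATCH_SIZE = 64
GRADIENT_ACCUMULATION_STEPS = 2


def effective_batch_size(per_step: int, accumulation: int) -> int:
    """Examples per optimizer update, assuming a single device."""
    return per_step * accumulation


def lr_at_step(step: int) -> float:
    """Learning rate under linear warmup followed by cosine decay (assumed form)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    # Half a cosine period: peak at progress 0, zero at progress 1.
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 25, 250, 500):
        print(step, lr_at_step(step))
```

Under these assumptions the rate peaks at step 25, well before the first Wer improvement at step 270, and is close to zero over the last few logged steps, where the training loss in the table stays around 0.43 to 0.44.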