---
license: apache-2.0
base_model: facebook/wav2vec2-large-xlsr-53
tags:
  - generated_from_trainer
datasets:
  - zeroth_korean
metrics:
  - wer
model-index:
  - name: wav2vec2-large-xlsr-53-fine-tune_korean_byAILAB2
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: zeroth_korean
          type: zeroth_korean
          config: clean
          split: test
          args: clean
        metrics:
          - name: Wer
            type: wer
            value: 0.9067911459117602
---

# wav2vec2-large-xlsr-53-fine-tune_korean_byAILAB2

This model is a fine-tuned version of [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on the zeroth_korean dataset. It achieves the following results on the evaluation set:

- Loss: 1.4929
- Wer: 0.9068
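
As a quick usage illustration, here is a minimal transcription sketch. The repository id, the `sample.wav` path, and the presence of processor/tokenizer files in the checkpoint are assumptions, not details confirmed by this card:

```python
# Minimal inference sketch. The repo id below is inferred from the card title
# and author name; replace it and "sample.wav" with your own values.
import torch
import torchaudio
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

model_id = "haseong8012/wav2vec2-large-xlsr-53-fine-tune_korean_byAILAB2"  # assumed
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

waveform, sr = torchaudio.load("sample.wav")  # placeholder: any mono Korean clip
if sr != 16_000:  # XLSR-53 expects 16 kHz input
    waveform = torchaudio.functional.resample(waveform, sr, 16_000)

inputs = processor(waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])  # greedy CTC decode of the transcript
```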

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 50
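
Read back as code, the list above maps onto `transformers.TrainingArguments` roughly as below. This is a hedged reconstruction, not the author's actual script: the `output_dir` and everything not in the list (dataset loading, preprocessing, the CTC data collator) are assumptions.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
# Adam betas=(0.9, 0.999) and epsilon=1e-8 are the TrainingArguments defaults,
# so they need no explicit arguments here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-large-xlsr-53-fine-tune_korean_byAILAB2",  # assumed
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=50,
)
```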

### Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| No log        | 0.99  | 38   | 54.4059         | 1.0    |
| No log        | 2.0   | 77   | 38.8388         | 1.0    |
| No log        | 2.99  | 115  | 24.1740         | 1.0    |
| No log        | 4.0   | 154  | 16.4733         | 1.0    |
| No log        | 4.99  | 192  | 10.1900         | 1.0    |
| No log        | 6.0   | 231  | 6.0076          | 1.0    |
| No log        | 6.99  | 269  | 4.8990          | 1.0    |
| No log        | 8.0   | 308  | 4.8442          | 1.0    |
| No log        | 8.99  | 346  | 4.8284          | 1.0    |
| No log        | 10.0  | 385  | 4.8316          | 1.0    |
| 16.886        | 10.99 | 423  | 4.8164          | 1.0    |
| 16.886        | 12.0  | 462  | 4.7815          | 1.0    |
| 16.886        | 12.99 | 500  | 4.7204          | 0.9989 |
| 16.886        | 14.0  | 539  | 4.6842          | 0.9989 |
| 16.886        | 14.99 | 577  | 4.6641          | 0.9994 |
| 16.886        | 16.0  | 616  | 4.6527          | 1.0    |
| 16.886        | 16.99 | 654  | 4.6745          | 0.9992 |
| 16.886        | 18.0  | 693  | 4.6591          | 1.0    |
| 16.886        | 18.99 | 731  | 4.6506          | 0.9997 |
| 16.886        | 20.0  | 770  | 4.6719          | 0.9967 |
| 4.4391        | 20.99 | 808  | 4.6067          | 0.9968 |
| 4.4391        | 22.0  | 847  | 4.5748          | 0.9968 |
| 4.4391        | 22.99 | 885  | 4.5166          | 0.9962 |
| 4.4391        | 24.0  | 924  | 4.3783          | 0.9926 |
| 4.4391        | 24.99 | 962  | 4.2711          | 0.9913 |
| 4.4391        | 26.0  | 1001 | 3.6515          | 1.0030 |
| 4.4391        | 26.99 | 1039 | 3.1057          | 1.0640 |
| 4.4391        | 28.0  | 1078 | 2.6593          | 1.0742 |
| 4.4391        | 28.99 | 1116 | 2.4071          | 1.0587 |
| 4.4391        | 30.0  | 1155 | 2.2041          | 1.0379 |
| 4.4391        | 30.99 | 1193 | 2.0495          | 1.0319 |
| 3.1722        | 32.0  | 1232 | 1.9754          | 1.0459 |
| 3.1722        | 32.99 | 1270 | 1.8658          | 0.9968 |
| 3.1722        | 34.0  | 1309 | 1.7887          | 0.9883 |
| 3.1722        | 34.99 | 1347 | 1.7560          | 0.9776 |
| 3.1722        | 36.0  | 1386 | 1.6987          | 0.9675 |
| 3.1722        | 36.99 | 1424 | 1.6513          | 0.9443 |
| 3.1722        | 38.0  | 1463 | 1.6187          | 0.9473 |
| 3.1722        | 38.99 | 1501 | 1.6210          | 0.9408 |
| 3.1722        | 40.0  | 1540 | 1.5957          | 0.9458 |
| 3.1722        | 40.99 | 1578 | 1.5673          | 0.9246 |
| 1.2364        | 42.0  | 1617 | 1.5748          | 0.9286 |
| 1.2364        | 42.99 | 1655 | 1.5333          | 0.9217 |
| 1.2364        | 44.0  | 1694 | 1.5138          | 0.9100 |
| 1.2364        | 44.99 | 1732 | 1.5244          | 0.9223 |
| 1.2364        | 46.0  | 1771 | 1.5041          | 0.9080 |
| 1.2364        | 46.99 | 1809 | 1.5151          | 0.9155 |
| 1.2364        | 48.0  | 1848 | 1.4955          | 0.9077 |
| 1.2364        | 48.99 | 1886 | 1.4924          | 0.9065 |
| 1.2364        | 49.35 | 1900 | 1.4929          | 0.9068 |
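
For context on the metric: Wer is the word error rate, the word-level edit distance between hypothesis and reference divided by the number of reference words, so the final 0.9068 means roughly nine word-level errors per ten reference words. A minimal sketch of computing it with the Hugging Face `evaluate` library (the transcripts below are hypothetical placeholders, not output of this model):

```python
# Hedged sketch of the WER computation; the example strings are made up.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["안녕 하세요"]           # hypothetical model transcription
references = ["안녕하세요 반갑습니다"]   # hypothetical ground-truth transcript
print(wer_metric.compute(predictions=predictions, references=references))
# -> 1.0 for this pair: both reference words count as substitutions, which
# shows how Korean spacing differences alone can inflate word-level error.
```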

### Framework versions

- Transformers 4.33.2
- PyTorch 1.12.1
- Datasets 2.14.5
- Tokenizers 0.13.3