---
license: apache-2.0
tags:
- automatic-speech-recognition
- AI_Light_Dance.py
- generated_from_trainer
datasets:
- ai_light_dance
model-index:
- name: ai-light-dance_singing_ft_wav2vec2-large-lv60
  results: []
---
# ai-light-dance_singing_ft_wav2vec2-large-lv60
This model is a fine-tuned version of [facebook/wav2vec2-large-lv60](https://huggingface.co/facebook/wav2vec2-large-lv60) on the AI_LIGHT_DANCE.PY - ONSET-SINGING dataset. It achieves the following results on the evaluation set:
- Loss: 0.4542
- Wer: 0.2088
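
As a minimal usage sketch, the fine-tuned checkpoint can be loaded with the Hugging Face Transformers `pipeline` API. The model identifier below is a placeholder for wherever this checkpoint is hosted or saved locally, and the audio file name is only an example.

```python
from transformers import pipeline

# Placeholder model id/path: point this at the actual location of the
# fine-tuned checkpoint (Hub repo or local directory).
asr = pipeline(
    "automatic-speech-recognition",
    model="ai-light-dance_singing_ft_wav2vec2-large-lv60",
)

# Transcribe an example singing clip (16 kHz mono audio works best for wav2vec2).
result = asr("example_singing_clip.wav")
print(result["text"])
```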
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training; a matching `TrainingArguments` sketch follows the list:
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10.0
- mixed_precision_training: Native AMP
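
For illustration, these settings correspond roughly to the following Transformers `TrainingArguments`; `output_dir` and anything not listed above are assumptions, and the Adam betas/epsilon shown are the Trainer defaults.

```python
from transformers import TrainingArguments

# Sketch of the training configuration implied by the hyperparameters above.
# output_dir is a placeholder; unspecified arguments keep their defaults.
training_args = TrainingArguments(
    output_dir="ai-light-dance_singing_ft_wav2vec2-large-lv60",  # assumed
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10.0,
    fp16=True,  # native AMP mixed-precision training
)
```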
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Wer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|
| 0.7432        | 1.0   | 4422  | 0.8939          | 0.6323 |
| 0.5484        | 2.0   | 8844  | 0.6393          | 0.3557 |
| 0.3919        | 3.0   | 13266 | 0.5315          | 0.2833 |
| 0.421         | 4.0   | 17688 | 0.5234          | 0.2522 |
| 0.3957        | 5.0   | 22110 | 0.5125          | 0.2247 |
| 0.3228        | 6.0   | 26532 | 0.4542          | 0.2088 |
| 0.346         | 7.0   | 30954 | 0.4673          | 0.1997 |
| 0.1637        | 8.0   | 35376 | 0.4583          | 0.1910 |
| 0.1508        | 9.0   | 39798 | 0.4623          | 0.1837 |
| 0.1564        | 10.0  | 44220 | 0.4717          | 0.1835 |
### Framework versions
- Transformers 4.20.0.dev0
- Pytorch 1.11.0+cu113
- Datasets 2.2.2.dev0
- Tokenizers 0.12.1