metadata
tags:
  - automatic-speech-recognition
  - gary109/AI_Light_Dance
  - generated_from_trainer
model-index:
  - name: ai-light-dance_singing5_ft_wav2vec2-large-xlsr-53-5gram-v4-2-1
    results: []

ai-light-dance_singing5_ft_wav2vec2-large-xlsr-53-5gram-v4-2-1

This model is a fine-tuned version of gary109/ai-light-dance_singing4_ft_wav2vec2-large-xlsr-53-5gram-v4-2-1 on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING5 dataset. It achieves the following results on the evaluation set, which match the epoch 45 (step 4500) row of the training results table:

  • Loss: 0.1732
  • Wer: 0.0831
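
For readers unfamiliar with the metric, the sketch below shows how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This assumes the standard WER definition; the card does not say which implementation produced the figure above, and the function name `word_error_rate` is illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a Wer of 0.0831 corresponds to roughly 8 errors per 100 reference tokens.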

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
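
The relationship between these values can be sketched in a few lines of plain Python. The snippet assumes a single training device (which is consistent with 8 × 8 = 64) and takes the total step count of 5000 from the last row of the training results table. The function `linear_schedule_lr` is a hypothetical helper that mirrors the usual linear warmup followed by linear decay; it is not the training code itself.

```python
learning_rate = 4e-05
train_batch_size = 8
gradient_accumulation_steps = 8
warmup_steps = 100
total_steps = 5000  # assumption: final Step value in the training results

# Gradients from 8 batches of 8 samples are accumulated before each update.
effective_batch_size = train_batch_size * gradient_accumulation_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


print(effective_batch_size)
for step in (0, 50, 100, 2550, 5000):
    print(step, linear_schedule_lr(step))
```

With these settings the learning rate peaks at 4e-05 once warmup ends at step 100 (the end of the first epoch) and falls to zero at step 5000.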

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|---------------|-------|------|-----------------|--------|
| 0.4351        | 1.0   | 100  | 0.1948          | 0.0903 |
| 0.4381        | 2.0   | 200  | 0.1961          | 0.0930 |
| 0.441         | 3.0   | 300  | 0.1948          | 0.0957 |
| 0.453         | 4.0   | 400  | 0.1971          | 0.0905 |
| 0.4324        | 5.0   | 500  | 0.1823          | 0.0879 |
| 0.4561        | 6.0   | 600  | 0.1934          | 0.0893 |
| 0.4231        | 7.0   | 700  | 0.2088          | 0.0977 |
| 0.4339        | 8.0   | 800  | 0.1924          | 0.0856 |
| 0.4195        | 9.0   | 900  | 0.1835          | 0.0846 |
| 0.4162        | 10.0  | 1000 | 0.1869          | 0.0908 |
| 0.411         | 11.0  | 1100 | 0.1966          | 0.0950 |
| 0.4034        | 12.0  | 1200 | 0.1890          | 0.0879 |
| 0.4155        | 13.0  | 1300 | 0.1844          | 0.0915 |
| 0.4123        | 14.0  | 1400 | 0.1849          | 0.0891 |
| 0.4002        | 15.0  | 1500 | 0.1901          | 0.0902 |
| 0.3983        | 16.0  | 1600 | 0.1879          | 0.0865 |
| 0.3907        | 17.0  | 1700 | 0.1863          | 0.0856 |
| 0.3969        | 18.0  | 1800 | 0.1773          | 0.0836 |
| 0.3721        | 19.0  | 1900 | 0.1834          | 0.0890 |
| 0.3987        | 20.0  | 2000 | 0.1817          | 0.0852 |
| 0.3863        | 21.0  | 2100 | 0.1898          | 0.0914 |
| 0.4052        | 22.0  | 2200 | 0.1882          | 0.0857 |
| 0.3811        | 23.0  | 2300 | 0.1874          | 0.0856 |
| 0.3791        | 24.0  | 2400 | 0.1932          | 0.0885 |
| 0.3919        | 25.0  | 2500 | 0.1847          | 0.0815 |
| 0.3891        | 26.0  | 2600 | 0.1850          | 0.0852 |
| 0.3719        | 27.0  | 2700 | 0.1774          | 0.0820 |
| 0.3791        | 28.0  | 2800 | 0.1756          | 0.0825 |
| 0.3537        | 29.0  | 2900 | 0.1797          | 0.0844 |
| 0.361         | 30.0  | 3000 | 0.1818          | 0.0834 |
| 0.3619        | 31.0  | 3100 | 0.1747          | 0.0838 |
| 0.3626        | 32.0  | 3200 | 0.1773          | 0.0844 |
| 0.3632        | 33.0  | 3300 | 0.1775          | 0.0825 |
| 0.3666        | 34.0  | 3400 | 0.1835          | 0.0859 |
| 0.3581        | 35.0  | 3500 | 0.1859          | 0.0868 |
| 0.3665        | 36.0  | 3600 | 0.1741          | 0.0849 |
| 0.3495        | 37.0  | 3700 | 0.1790          | 0.0837 |
| 0.3509        | 38.0  | 3800 | 0.1782          | 0.0841 |
| 0.3621        | 39.0  | 3900 | 0.1759          | 0.0841 |
| 0.3415        | 40.0  | 4000 | 0.1796          | 0.0851 |
| 0.3508        | 41.0  | 4100 | 0.1777          | 0.0821 |
| 0.3493        | 42.0  | 4200 | 0.1758          | 0.0829 |
| 0.359         | 43.0  | 4300 | 0.1788          | 0.0848 |
| 0.3438        | 44.0  | 4400 | 0.1782          | 0.0836 |
| 0.3642        | 45.0  | 4500 | 0.1732          | 0.0831 |
| 0.3456        | 46.0  | 4600 | 0.1768          | 0.0823 |
| 0.3532        | 47.0  | 4700 | 0.1735          | 0.0834 |
| 0.3448        | 48.0  | 4800 | 0.1755          | 0.0827 |
| 0.3487        | 49.0  | 4900 | 0.1767          | 0.0833 |
| 0.3427        | 50.0  | 5000 | 0.1774          | 0.0836 |
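
When choosing among these checkpoints, the best one depends on the metric used. The sketch below copies a handful of rows from the results above (an excerpt, not the full log) and picks the best row by validation loss and by Wer. The variable names are illustrative, and the card does not state which selection criterion, if any, was used during training.

```python
# (epoch, step, validation_loss, wer): excerpt copied from the training results.
rows = [
    (18.0, 1800, 0.1773, 0.0836),
    (25.0, 2500, 0.1847, 0.0815),
    (36.0, 3600, 0.1741, 0.0849),
    (45.0, 4500, 0.1732, 0.0831),
    (47.0, 4700, 0.1735, 0.0834),
    (50.0, 5000, 0.1774, 0.0836),
]

# Lower is better for both metrics, so min() selects the best row.
best_by_loss = min(rows, key=lambda row: row[2])
best_by_wer = min(rows, key=lambda row: row[3])

print("lowest validation loss:", best_by_loss)
print("lowest Wer:", best_by_wer)
```

In this excerpt the lowest validation loss falls at epoch 45 while the lowest Wer falls at epoch 25, so the two criteria would select different checkpoints.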

Framework versions

  • Transformers 4.21.0.dev0
  • PyTorch 1.9.1+cu102
  • Datasets 2.3.3.dev0
  • Tokenizers 0.12.1