
speecht5_finetuned_voxpopuli_it

This model is a fine-tuned version of microsoft/speecht5_tts on the VoxPopuli (Italian) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4968
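
A minimal inference sketch for a SpeechT5 TTS checkpoint like this one is shown below. The repository id used for the checkpoint is a placeholder, and the speaker-embedding source (CMU Arctic x-vectors) is an assumption carried over from the standard SpeechT5 examples; the model card itself does not specify either.

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Placeholder repo id (assumption): point this at wherever the fine-tuned model is hosted.
checkpoint = "speecht5_finetuned_voxpopuli_it"

processor = SpeechT5Processor.from_pretrained(checkpoint)
model = SpeechT5ForTextToSpeech.from_pretrained(checkpoint)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 conditions generation on a speaker x-vector; here we borrow one from the
# CMU Arctic x-vector dataset, as in the upstream SpeechT5 examples (assumption).
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings_dataset[7306]["xvector"]).unsqueeze(0)

inputs = processor(text="Buongiorno, questo è un test di sintesi vocale.", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)

# SpeechT5's vocoder produces 16 kHz audio.
sf.write("output.wav", speech.numpy(), samplerate=16000)
```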

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch expressing them as training arguments follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 25
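
The sketch below reconstructs the hyperparameters above as `Seq2SeqTrainingArguments`. The `output_dir` and the evaluation/reporting settings are assumptions not stated in the card; the optimizer betas/epsilon and the linear scheduler listed above match the library defaults, so they are not set explicitly here.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_finetuned_voxpopuli_it",  # placeholder (assumption)
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    warmup_ratio=0.1,
    num_train_epochs=25,
    seed=42,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",     # assumption: the card reports validation loss per epoch
)
```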

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.6707        | 1.0   | 108  | 0.5946          |
| 0.6625        | 2.0   | 217  | 0.6029          |
| 0.7080        | 3.0   | 325  | 0.6118          |
| 0.6588        | 4.0   | 434  | 0.7109          |
| 0.6614        | 5.0   | 542  | 0.5799          |
| 0.6375        | 6.0   | 651  | 0.5714          |
| 0.6190        | 7.0   | 759  | 0.5699          |
| 0.5806        | 8.0   | 868  | 0.5538          |
| 0.6024        | 9.0   | 976  | 0.5856          |
| 0.5728        | 10.0  | 1085 | 0.5446          |
| 0.5624        | 11.0  | 1193 | 0.5508          |
| 0.5711        | 12.0  | 1302 | 0.5376          |
| 0.5438        | 13.0  | 1410 | 0.5300          |
| 0.5308        | 14.0  | 1519 | 0.5206          |
| 0.5536        | 15.0  | 1627 | 0.5359          |
| 0.5285        | 16.0  | 1736 | 0.5264          |
| 0.5250        | 17.0  | 1844 | 0.5108          |
| 0.4961        | 18.0  | 1953 | 0.5116          |
| 0.5111        | 19.0  | 2061 | 0.5042          |
| 0.4869        | 20.0  | 2170 | 0.5050          |
| 0.4864        | 21.0  | 2278 | 0.4994          |
| 0.4794        | 22.0  | 2387 | 0.5039          |
| 0.4787        | 23.0  | 2495 | 0.4975          |
| 0.4692        | 24.0  | 2604 | 0.4961          |
| 0.4656        | 24.88 | 2700 | 0.4968          |

Framework versions

  • Transformers 4.30.2
  • Pytorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.13.3