
speecht5_finetuned_speaking_style_en

This model is a fine-tuned version of microsoft/speecht5_tts on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 0.3277

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
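
The effective batch size and the learning-rate schedule implied by these settings can be reproduced in a few lines of plain Python. This is a minimal sketch, not the training script: it assumes the linear scheduler ramps from 0 to the peak learning rate over the warmup steps and then decays linearly to 0 at the final step, and that training ran on a single device (which is consistent with 8 × 8 = 64). The names `effective_batch_size` and `linear_lr` are illustrative and not part of any library.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_STEPS = 500
TRAINING_STEPS = 4000
NUM_DEVICES = 1  # assumption: single device, consistent with 8 * 8 = 64


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 250, 500, 2250, 4000):
        print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 1e-05 at step 500 and is back to half of that by step 2250, the midpoint of the decay phase.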

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 0.8232 | 0.61 | 100 | 0.5842 |
| 0.6949 | 1.23 | 200 | 0.4895 |
| 0.4918 | 1.84 | 300 | 0.3843 |
| 0.4266 | 2.45 | 400 | 0.3689 |
| 0.4098 | 3.07 | 500 | 0.3599 |
| 0.4026 | 3.68 | 600 | 0.3593 |
| 0.3947 | 4.29 | 700 | 0.3513 |
| 0.386 | 4.9 | 800 | 0.3481 |
| 0.3809 | 5.52 | 900 | 0.3457 |
| 0.3777 | 6.13 | 1000 | 0.3450 |
| 0.3745 | 6.74 | 1100 | 0.3418 |
| 0.3724 | 7.36 | 1200 | 0.3409 |
| 0.3697 | 7.97 | 1300 | 0.3404 |
| 0.3687 | 8.58 | 1400 | 0.3379 |
| 0.3684 | 9.2 | 1500 | 0.3373 |
| 0.3666 | 9.81 | 1600 | 0.3352 |
| 0.3637 | 10.42 | 1700 | 0.3395 |
| 0.3638 | 11.03 | 1800 | 0.3333 |
| 0.3594 | 11.65 | 1900 | 0.3333 |
| 0.3603 | 12.26 | 2000 | 0.3378 |
| 0.3592 | 12.87 | 2100 | 0.3316 |
| 0.3587 | 13.49 | 2200 | 0.3321 |
| 0.3557 | 14.1 | 2300 | 0.3311 |
| 0.3568 | 14.71 | 2400 | 0.3300 |
| 0.3595 | 15.33 | 2500 | 0.3291 |
| 0.3565 | 15.94 | 2600 | 0.3323 |
| 0.3549 | 16.55 | 2700 | 0.3305 |
| 0.3534 | 17.16 | 2800 | 0.3299 |
| 0.3545 | 17.78 | 2900 | 0.3268 |
| 0.3533 | 18.39 | 3000 | 0.3298 |
| 0.3529 | 19.0 | 3100 | 0.3306 |
| 0.3526 | 19.62 | 3200 | 0.3285 |
| 0.3513 | 20.23 | 3300 | 0.3274 |
| 0.3513 | 20.84 | 3400 | 0.3278 |
| 0.3505 | 21.46 | 3500 | 0.3295 |
| 0.3502 | 22.07 | 3600 | 0.3283 |
| 0.3505 | 22.68 | 3700 | 0.3295 |
| 0.3527 | 23.3 | 3800 | 0.3289 |
| 0.3518 | 23.91 | 3900 | 0.3275 |
| 0.3496 | 24.52 | 4000 | 0.3277 |
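
To read these results programmatically, the rows can be parsed and scanned for the lowest validation loss. The sketch below embeds the values from the results above verbatim as whitespace-separated columns (training loss, epoch, step, validation loss). The helper names `parse_rows`, `best_row`, and `steps_per_epoch` are illustrative, and the steps-per-epoch estimate assumes the Epoch column is the step count divided by a constant number of optimizer steps per epoch.

```python
ROWS = """\
0.8232 0.61 100 0.5842
0.6949 1.23 200 0.4895
0.4918 1.84 300 0.3843
0.4266 2.45 400 0.3689
0.4098 3.07 500 0.3599
0.4026 3.68 600 0.3593
0.3947 4.29 700 0.3513
0.386 4.9 800 0.3481
0.3809 5.52 900 0.3457
0.3777 6.13 1000 0.3450
0.3745 6.74 1100 0.3418
0.3724 7.36 1200 0.3409
0.3697 7.97 1300 0.3404
0.3687 8.58 1400 0.3379
0.3684 9.2 1500 0.3373
0.3666 9.81 1600 0.3352
0.3637 10.42 1700 0.3395
0.3638 11.03 1800 0.3333
0.3594 11.65 1900 0.3333
0.3603 12.26 2000 0.3378
0.3592 12.87 2100 0.3316
0.3587 13.49 2200 0.3321
0.3557 14.1 2300 0.3311
0.3568 14.71 2400 0.3300
0.3595 15.33 2500 0.3291
0.3565 15.94 2600 0.3323
0.3549 16.55 2700 0.3305
0.3534 17.16 2800 0.3299
0.3545 17.78 2900 0.3268
0.3533 18.39 3000 0.3298
0.3529 19.0 3100 0.3306
0.3526 19.62 3200 0.3285
0.3513 20.23 3300 0.3274
0.3513 20.84 3400 0.3278
0.3505 21.46 3500 0.3295
0.3502 22.07 3600 0.3283
0.3505 22.68 3700 0.3295
0.3527 23.3 3800 0.3289
0.3518 23.91 3900 0.3275
0.3496 24.52 4000 0.3277
"""


def parse_rows(text):
    """Turn the whitespace-separated rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


def best_row(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


def steps_per_epoch(rows):
    """Rough estimate from the last row: step / epoch."""
    last = rows[-1]
    return last["step"] / last["epoch"]


if __name__ == "__main__":
    rows = parse_rows(ROWS)
    best = best_row(rows)
    print("lowest validation loss:", best["val_loss"], "at step", best["step"])
    print("final validation loss:", rows[-1]["val_loss"])
    print("estimated steps per epoch:", round(steps_per_epoch(rows), 1))
```

The lowest validation loss in the table is 0.3268 at step 2900, slightly below the final value of 0.3277 at step 4000. The last row implies about 163 optimizer steps per epoch, which at a total batch size of 64 would correspond to roughly 10,400 training examples; this is only an estimate, since the dataset size is not stated.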

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2
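
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the function name `check_versions` is illustrative, and the package names are the PyPI distribution names (PyTorch is published as `torch`).

```python
from importlib import metadata

# Versions listed in the card; "torch" is the PyPI name for PyTorch.
EXPECTED = {
    "transformers": "4.38.2",
    "torch": "2.1.2",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def check_versions(expected=EXPECTED):
    """Map each package to (expected, installed or None, matches)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu121" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only flags where the environment differs from the one recorded in the card.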

Model size: 144M params (Safetensors, tensor type F32)