---
license: mit
base_model: microsoft/speecht5_tts
tags:
  - generated_from_trainer
datasets:
  - common_voice_13_0
model-index:
  - name: speecht5_tts
    results: []
---

# speecht5_tts

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the common_voice_13_0 dataset. It achieves the following results on the evaluation set:

- Loss: 0.5720

## Model description

More information needed

Intended uses & limitations

More information needed
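
For text-to-speech inference, the checkpoint can be loaded with the standard `transformers` SpeechT5 classes. Below is a minimal sketch; the repo id `JBZhang2342/speecht5_tts` and the CMU ARCTIC x-vector speaker embedding are assumptions for illustration, since the card confirms neither:

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Assumed repo id for this fine-tuned checkpoint (not confirmed by the card).
model_id = "JBZhang2342/speecht5_tts"

processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 requires a speaker embedding; the CMU ARCTIC x-vectors are a common
# stand-in, but embeddings from the fine-tuning speakers would match better.
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embedding = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

inputs = processor(text="Hello, this is a test.", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embedding, vocoder=vocoder)

# SpeechT5 generates 16 kHz audio.
sf.write("speech.wav", speech.numpy(), samplerate=16000)
```

As with the base model, output quality depends on how closely the chosen speaker embedding matches the voices seen during fine-tuning.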

## Training and evaluation data

More information needed
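
The card only names `common_voice_13_0`. Assuming the Hub dataset `mozilla-foundation/common_voice_13_0` was used, loading would look like the sketch below; the `"en"` language config is a placeholder, since the card does not state the language:

```python
from datasets import Audio, load_dataset

# Hypothetical Hub id and language config; the card specifies neither.
dataset = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="train")

# SpeechT5 operates on 16 kHz audio, so resample the Common Voice clips.
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))
```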

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 20000
- mixed_precision_training: Native AMP
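
A sketch of `Seq2SeqTrainingArguments` mirroring the values above; the `output_dir` is a placeholder, and the evaluation cadence is inferred from the results log (one eval every 100 steps), not stated by the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=20000,
    fp16=True,  # "Native AMP" mixed precision
    evaluation_strategy="steps",  # assumed from the 100-step eval log
    eval_steps=100,
)
```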

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.0398 | 1.0 | 100 | 0.7784 |
| 0.8663 | 2.0 | 200 | 0.7059 |
| 0.7824 | 3.0 | 300 | 0.6734 |
| 0.7181 | 4.0 | 400 | 0.5776 |
| 0.6418 | 5.0 | 500 | 0.5584 |
| 0.6127 | 6.0 | 600 | 0.5452 |
| 0.5939 | 7.0 | 700 | 0.5386 |
| 0.5924 | 8.0 | 800 | 0.5420 |
| 0.5887 | 9.0 | 900 | 0.5392 |
| 0.5769 | 10.0 | 1000 | 0.5319 |
| 0.578 | 11.0 | 1100 | 0.5345 |
| 0.5799 | 12.0 | 1200 | 0.5257 |
| 0.5614 | 13.0 | 1300 | 0.5342 |
| 0.554 | 14.0 | 1400 | 0.5223 |
| 0.551 | 15.0 | 1500 | 0.5209 |
| 0.5587 | 16.0 | 1600 | 0.5221 |
| 0.5485 | 17.0 | 1700 | 0.5193 |
| 0.5354 | 18.0 | 1800 | 0.5216 |
| 0.5417 | 19.0 | 1900 | 0.5260 |
| 0.5319 | 20.0 | 2000 | 0.5218 |
| 0.5354 | 21.0 | 2100 | 0.5212 |
| 0.5316 | 22.0 | 2200 | 0.5233 |
| 0.5295 | 23.0 | 2300 | 0.5222 |
| 0.5407 | 24.0 | 2400 | 0.5317 |
| 0.5309 | 25.0 | 2500 | 0.5258 |
| 0.5196 | 26.0 | 2600 | 0.5317 |
| 0.5195 | 27.0 | 2700 | 0.5325 |
| 0.5134 | 28.0 | 2800 | 0.5193 |
| 0.5143 | 29.0 | 2900 | 0.5254 |
| 0.5227 | 30.0 | 3000 | 0.5260 |
| 0.5157 | 31.0 | 3100 | 0.5311 |
| 0.5214 | 32.0 | 3200 | 0.5292 |
| 0.5196 | 33.0 | 3300 | 0.5283 |
| 0.522 | 34.0 | 3400 | 0.5296 |
| 0.5193 | 35.0 | 3500 | 0.5252 |
| 0.5156 | 36.0 | 3600 | 0.5272 |
| 0.5182 | 37.0 | 3700 | 0.5318 |
| 0.5079 | 38.0 | 3800 | 0.5289 |
| 0.5103 | 39.0 | 3900 | 0.5374 |
| 0.5044 | 40.0 | 4000 | 0.5289 |
| 0.5021 | 41.0 | 4100 | 0.5372 |
| 0.5202 | 42.0 | 4200 | 0.5384 |
| 0.5022 | 43.0 | 4300 | 0.5281 |
| 0.498 | 44.0 | 4400 | 0.5327 |
| 0.4991 | 45.0 | 4500 | 0.5351 |
| 0.4972 | 46.0 | 4600 | 0.5383 |
| 0.5075 | 47.0 | 4700 | 0.5319 |
| 0.5063 | 48.0 | 4800 | 0.5365 |
| 0.4964 | 49.0 | 4900 | 0.5361 |
| 0.5021 | 50.0 | 5000 | 0.5353 |
| 0.4981 | 51.0 | 5100 | 0.5419 |
| 0.4914 | 52.0 | 5200 | 0.5398 |
| 0.5016 | 53.0 | 5300 | 0.5499 |
| 0.4911 | 54.0 | 5400 | 0.5484 |
| 0.5048 | 55.0 | 5500 | 0.5369 |
| 0.4828 | 56.0 | 5600 | 0.5452 |
| 0.4906 | 57.0 | 5700 | 0.5446 |
| 0.4922 | 58.0 | 5800 | 0.5451 |
| 0.4851 | 59.0 | 5900 | 0.5444 |
| 0.4898 | 60.0 | 6000 | 0.5461 |
| 0.4858 | 61.0 | 6100 | 0.5388 |
| 0.4966 | 62.0 | 6200 | 0.5408 |
| 0.4935 | 63.0 | 6300 | 0.5442 |
| 0.4824 | 64.0 | 6400 | 0.5466 |
| 0.4967 | 65.0 | 6500 | 0.5486 |
| 0.4789 | 66.0 | 6600 | 0.5429 |
| 0.481 | 67.0 | 6700 | 0.5516 |
| 0.4873 | 68.0 | 6800 | 0.5452 |
| 0.4816 | 69.0 | 6900 | 0.5497 |
| 0.4911 | 70.0 | 7000 | 0.5546 |
| 0.4805 | 71.0 | 7100 | 0.5460 |
| 0.4781 | 72.0 | 7200 | 0.5486 |
| 0.4923 | 73.0 | 7300 | 0.5479 |
| 0.4779 | 74.0 | 7400 | 0.5467 |
| 0.4778 | 75.0 | 7500 | 0.5513 |
| 0.4826 | 76.0 | 7600 | 0.5513 |
| 0.4756 | 77.0 | 7700 | 0.5509 |
| 0.4698 | 78.0 | 7800 | 0.5528 |
| 0.4868 | 79.0 | 7900 | 0.5559 |
| 0.478 | 80.0 | 8000 | 0.5523 |
| 0.472 | 81.0 | 8100 | 0.5570 |
| 0.4835 | 82.0 | 8200 | 0.5542 |
| 0.4813 | 83.0 | 8300 | 0.5538 |
| 0.472 | 84.0 | 8400 | 0.5503 |
| 0.4726 | 85.0 | 8500 | 0.5521 |
| 0.4804 | 86.0 | 8600 | 0.5577 |
| 0.4836 | 87.0 | 8700 | 0.5554 |
| 0.4786 | 88.0 | 8800 | 0.5603 |
| 0.471 | 89.0 | 8900 | 0.5597 |
| 0.4782 | 90.0 | 9000 | 0.5543 |
| 0.4713 | 91.0 | 9100 | 0.5549 |
| 0.4825 | 92.0 | 9200 | 0.5585 |
| 0.4749 | 93.0 | 9300 | 0.5598 |
| 0.4684 | 94.0 | 9400 | 0.5574 |
| 0.4732 | 95.0 | 9500 | 0.5577 |
| 0.4663 | 96.0 | 9600 | 0.5596 |
| 0.4618 | 97.0 | 9700 | 0.5555 |
| 0.4637 | 98.0 | 9800 | 0.5563 |
| 0.4731 | 99.0 | 9900 | 0.5578 |
| 0.485 | 100.0 | 10000 | 0.5591 |
| 0.475 | 101.0 | 10100 | 0.5598 |
| 0.4631 | 102.0 | 10200 | 0.5539 |
| 0.4636 | 103.0 | 10300 | 0.5567 |
| 0.4686 | 104.0 | 10400 | 0.5554 |
| 0.4677 | 105.0 | 10500 | 0.5530 |
| 0.4705 | 106.0 | 10600 | 0.5555 |
| 0.4596 | 107.0 | 10700 | 0.5567 |
| 0.4689 | 108.0 | 10800 | 0.5552 |
| 0.4698 | 109.0 | 10900 | 0.5591 |
| 0.4767 | 110.0 | 11000 | 0.5583 |
| 0.466 | 111.0 | 11100 | 0.5594 |
| 0.4792 | 112.0 | 11200 | 0.5604 |
| 0.4692 | 113.0 | 11300 | 0.5635 |
| 0.4675 | 114.0 | 11400 | 0.5597 |
| 0.467 | 115.0 | 11500 | 0.5587 |
| 0.4653 | 116.0 | 11600 | 0.5610 |
| 0.468 | 117.0 | 11700 | 0.5608 |
| 0.4649 | 118.0 | 11800 | 0.5625 |
| 0.4614 | 119.0 | 11900 | 0.5606 |
| 0.4663 | 120.0 | 12000 | 0.5626 |
| 0.4654 | 121.0 | 12100 | 0.5623 |
| 0.4582 | 122.0 | 12200 | 0.5613 |
| 0.4621 | 123.0 | 12300 | 0.5650 |
| 0.459 | 124.0 | 12400 | 0.5617 |
| 0.4538 | 125.0 | 12500 | 0.5609 |
| 0.4602 | 126.0 | 12600 | 0.5636 |
| 0.462 | 127.0 | 12700 | 0.5661 |
| 0.4647 | 128.0 | 12800 | 0.5585 |
| 0.4616 | 129.0 | 12900 | 0.5638 |
| 0.4691 | 130.0 | 13000 | 0.5658 |
| 0.4645 | 131.0 | 13100 | 0.5646 |
| 0.4581 | 132.0 | 13200 | 0.5638 |
| 0.4546 | 133.0 | 13300 | 0.5656 |
| 0.4633 | 134.0 | 13400 | 0.5651 |
| 0.4626 | 135.0 | 13500 | 0.5652 |
| 0.4663 | 136.0 | 13600 | 0.5657 |
| 0.4598 | 137.0 | 13700 | 0.5639 |
| 0.4711 | 138.0 | 13800 | 0.5650 |
| 0.4595 | 139.0 | 13900 | 0.5678 |
| 0.4586 | 140.0 | 14000 | 0.5638 |
| 0.4562 | 141.0 | 14100 | 0.5668 |
| 0.456 | 142.0 | 14200 | 0.5673 |
| 0.4561 | 143.0 | 14300 | 0.5694 |
| 0.4562 | 144.0 | 14400 | 0.5685 |
| 0.4583 | 145.0 | 14500 | 0.5642 |
| 0.446 | 146.0 | 14600 | 0.5690 |
| 0.4631 | 147.0 | 14700 | 0.5647 |
| 0.4553 | 148.0 | 14800 | 0.5673 |
| 0.4569 | 149.0 | 14900 | 0.5658 |
| 0.4618 | 150.0 | 15000 | 0.5645 |
| 0.4586 | 151.0 | 15100 | 0.5693 |
| 0.4474 | 152.0 | 15200 | 0.5683 |
| 0.4499 | 153.0 | 15300 | 0.5687 |
| 0.4533 | 154.0 | 15400 | 0.5687 |
| 0.452 | 155.0 | 15500 | 0.5693 |
| 0.4578 | 156.0 | 15600 | 0.5681 |
| 0.4534 | 157.0 | 15700 | 0.5697 |
| 0.4554 | 158.0 | 15800 | 0.5695 |
| 0.4532 | 159.0 | 15900 | 0.5728 |
| 0.4471 | 160.0 | 16000 | 0.5746 |
| 0.4528 | 161.0 | 16100 | 0.5715 |
| 0.4535 | 162.0 | 16200 | 0.5677 |
| 0.4487 | 163.0 | 16300 | 0.5719 |
| 0.4539 | 164.0 | 16400 | 0.5673 |
| 0.4493 | 165.0 | 16500 | 0.5722 |
| 0.4463 | 166.0 | 16600 | 0.5725 |
| 0.4547 | 167.0 | 16700 | 0.5693 |
| 0.4557 | 168.0 | 16800 | 0.5697 |
| 0.4548 | 169.0 | 16900 | 0.5727 |
| 0.4551 | 170.0 | 17000 | 0.5732 |
| 0.4633 | 171.0 | 17100 | 0.5725 |
| 0.4529 | 172.0 | 17200 | 0.5744 |
| 0.4542 | 173.0 | 17300 | 0.5745 |
| 0.4551 | 174.0 | 17400 | 0.5725 |
| 0.4562 | 175.0 | 17500 | 0.5724 |
| 0.4473 | 176.0 | 17600 | 0.5746 |
| 0.4491 | 177.0 | 17700 | 0.5714 |
| 0.4498 | 178.0 | 17800 | 0.5729 |
| 0.4612 | 179.0 | 17900 | 0.5704 |
| 0.4565 | 180.0 | 18000 | 0.5725 |
| 0.4571 | 181.0 | 18100 | 0.5716 |
| 0.4561 | 182.0 | 18200 | 0.5718 |
| 0.4542 | 183.0 | 18300 | 0.5726 |
| 0.4563 | 184.0 | 18400 | 0.5730 |
| 0.4517 | 185.0 | 18500 | 0.5726 |
| 0.449 | 186.0 | 18600 | 0.5715 |
| 0.4513 | 187.0 | 18700 | 0.5744 |
| 0.4487 | 188.0 | 18800 | 0.5769 |
| 0.4516 | 189.0 | 18900 | 0.5759 |
| 0.4524 | 190.0 | 19000 | 0.5741 |
| 0.4586 | 191.0 | 19100 | 0.5730 |
| 0.4507 | 192.0 | 19200 | 0.5748 |
| 0.4488 | 193.0 | 19300 | 0.5728 |
| 0.4635 | 194.0 | 19400 | 0.5739 |
| 0.4566 | 195.0 | 19500 | 0.5779 |
| 0.4556 | 196.0 | 19600 | 0.5745 |
| 0.4577 | 197.0 | 19700 | 0.5776 |
| 0.4481 | 198.0 | 19800 | 0.5746 |
| 0.4576 | 199.0 | 19900 | 0.5737 |
| 0.4523 | 200.0 | 20000 | 0.5720 |

### Framework versions

- Transformers 4.36.0.dev0
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1