---
license: mit
base_model: microsoft/speecht5_tts
tags:
  - generated_from_trainer
model-index:
  - name: speecht5_tts
    results: []
---

# speecht5_tts

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.7806
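
As orientation for loading the checkpoint, here is a minimal inference sketch using the standard SpeechT5 pipeline from Transformers. The repository id `JBZhang2342/speecht5_tts`, the `microsoft/speecht5_hifigan` vocoder, and the zeroed speaker embedding are assumptions, not settings confirmed by this card:

```python
import torch
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Checkpoint id is an assumption based on this repository's name.
processor = SpeechT5Processor.from_pretrained("JBZhang2342/speecht5_tts")
model = SpeechT5ForTextToSpeech.from_pretrained("JBZhang2342/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Hello, this is a test.", return_tensors="pt")

# SpeechT5 conditions generation on a 512-dimensional speaker x-vector.
# A zero vector is only a placeholder; use an embedding extracted from
# real speech (e.g. with a speaker-verification model) for usable audio.
speaker_embeddings = torch.zeros((1, 512))

# Returns a 1-D float tensor with the synthesized 16 kHz waveform.
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
```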

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 30000
- mixed_precision_training: Native AMP
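
As a reproducibility aid, the sketch below maps the values above onto `Seq2SeqTrainingArguments` from Transformers. The `output_dir` and the evaluation/logging cadence (`eval_steps=250`, `logging_steps=500`) are assumptions inferred from the results table below, not confirmed settings:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts",       # assumption
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=30_000,
    fp16=True,                       # Native AMP mixed precision
    evaluation_strategy="steps",
    eval_steps=250,                  # inferred: one evaluation every 250 steps
    logging_steps=500,               # inferred: "No log" in the first results row
)
```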

### Training results

The training loss was logged every 500 steps while evaluation ran every 250, so the first row shows "No log" and each logged training loss repeats across two evaluation rows.

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 0.53  | 250   | 0.8506          |
| 1.0736        | 1.06  | 500   | 0.8219          |
| 1.0736        | 1.6   | 750   | 0.7713          |
| 0.8607        | 2.13  | 1000  | 0.7947          |
| 0.8607        | 2.66  | 1250  | 0.7537          |
| 0.802         | 3.19  | 1500  | 0.7304          |
| 0.802         | 3.72  | 1750  | 0.7409          |
| 0.7627        | 4.26  | 2000  | 0.7282          |
| 0.7627        | 4.79  | 2250  | 0.7224          |
| 0.7442        | 5.32  | 2500  | 0.7132          |
| 0.7442        | 5.85  | 2750  | 0.7718          |
| 0.736         | 6.38  | 3000  | 0.7362          |
| 0.736         | 6.91  | 3250  | 0.7283          |
| 0.7234        | 7.45  | 3500  | 0.7377          |
| 0.7234        | 7.98  | 3750  | 0.7226          |
| 0.6968        | 8.51  | 4000  | 0.7285          |
| 0.6968        | 9.04  | 4250  | 0.7395          |
| 0.692         | 9.57  | 4500  | 0.7306          |
| 0.692         | 10.11 | 4750  | 0.7221          |
| 0.6807        | 10.64 | 5000  | 0.7349          |
| 0.6807        | 11.17 | 5250  | 0.7310          |
| 0.6702        | 11.7  | 5500  | 0.7391          |
| 0.6702        | 12.23 | 5750  | 0.7299          |
| 0.6559        | 12.77 | 6000  | 0.7277          |
| 0.6559        | 13.3  | 6250  | 0.7453          |
| 0.6511        | 13.83 | 6500  | 0.7303          |
| 0.6511        | 14.36 | 6750  | 0.7451          |
| 0.6335        | 14.89 | 7000  | 0.7209          |
| 0.6335        | 15.43 | 7250  | 0.7421          |
| 0.6282        | 15.96 | 7500  | 0.7277          |
| 0.6282        | 16.49 | 7750  | 0.7426          |
| 0.6286        | 17.02 | 8000  | 0.7724          |
| 0.6286        | 17.55 | 8250  | 0.7310          |
| 0.6164        | 18.09 | 8500  | 0.7414          |
| 0.6164        | 18.62 | 8750  | 0.7411          |
| 0.6029        | 19.15 | 9000  | 0.7466          |
| 0.6029        | 19.68 | 9250  | 0.7267          |
| 0.5986        | 20.21 | 9500  | 0.7593          |
| 0.5986        | 20.74 | 9750  | 0.7544          |
| 0.595         | 21.28 | 10000 | 0.7441          |
| 0.595         | 21.81 | 10250 | 0.7422          |
| 0.5905        | 22.34 | 10500 | 0.7399          |
| 0.5905        | 22.87 | 10750 | 0.7494          |
| 0.5792        | 23.4  | 11000 | 0.7311          |
| 0.5792        | 23.94 | 11250 | 0.7479          |
| 0.5774        | 24.47 | 11500 | 0.7615          |
| 0.5774        | 25.0  | 11750 | 0.7578          |
| 0.5684        | 25.53 | 12000 | 0.7603          |
| 0.5684        | 26.06 | 12250 | 0.7300          |
| 0.5621        | 26.6  | 12500 | 0.7385          |
| 0.5621        | 27.13 | 12750 | 0.7447          |
| 0.5666        | 27.66 | 13000 | 0.7400          |
| 0.5666        | 28.19 | 13250 | 0.7518          |
| 0.5525        | 28.72 | 13500 | 0.7462          |
| 0.5525        | 29.26 | 13750 | 0.7351          |
| 0.5471        | 29.79 | 14000 | 0.7673          |
| 0.5471        | 30.32 | 14250 | 0.7325          |
| 0.5449        | 30.85 | 14500 | 0.7455          |
| 0.5449        | 31.38 | 14750 | 0.7473          |
| 0.5349        | 31.91 | 15000 | 0.7549          |
| 0.5349        | 32.45 | 15250 | 0.7513          |
| 0.5345        | 32.98 | 15500 | 0.7472          |
| 0.5345        | 33.51 | 15750 | 0.7542          |
| 0.5285        | 34.04 | 16000 | 0.7513          |
| 0.5285        | 34.57 | 16250 | 0.7466          |
| 0.522         | 35.11 | 16500 | 0.7627          |
| 0.522         | 35.64 | 16750 | 0.7609          |
| 0.5209        | 36.17 | 17000 | 0.7616          |
| 0.5209        | 36.7  | 17250 | 0.7612          |
| 0.5151        | 37.23 | 17500 | 0.7601          |
| 0.5151        | 37.77 | 17750 | 0.7590          |
| 0.5088        | 38.3  | 18000 | 0.7568          |
| 0.5088        | 38.83 | 18250 | 0.7551          |
| 0.5105        | 39.36 | 18500 | 0.7688          |
| 0.5105        | 39.89 | 18750 | 0.7631          |
| 0.5046        | 40.43 | 19000 | 0.7654          |
| 0.5046        | 40.96 | 19250 | 0.7749          |
| 0.5029        | 41.49 | 19500 | 0.7617          |
| 0.5029        | 42.02 | 19750 | 0.7735          |
| 0.4969        | 42.55 | 20000 | 0.7763          |
| 0.4969        | 43.09 | 20250 | 0.7484          |
| 0.497         | 43.62 | 20500 | 0.7606          |
| 0.497         | 44.15 | 20750 | 0.7726          |
| 0.4889        | 44.68 | 21000 | 0.7564          |
| 0.4889        | 45.21 | 21250 | 0.7694          |
| 0.4842        | 45.74 | 21500 | 0.7639          |
| 0.4842        | 46.28 | 21750 | 0.7784          |
| 0.4829        | 46.81 | 22000 | 0.7817          |
| 0.4829        | 47.34 | 22250 | 0.7727          |
| 0.4772        | 47.87 | 22500 | 0.7661          |
| 0.4772        | 48.4  | 22750 | 0.7630          |
| 0.477         | 48.94 | 23000 | 0.7640          |
| 0.477         | 49.47 | 23250 | 0.7730          |
| 0.4766        | 50.0  | 23500 | 0.7708          |
| 0.4766        | 50.53 | 23750 | 0.7716          |
| 0.4717        | 51.06 | 24000 | 0.7670          |
| 0.4717        | 51.6  | 24250 | 0.7671          |
| 0.4686        | 52.13 | 24500 | 0.7711          |
| 0.4686        | 52.66 | 24750 | 0.7704          |
| 0.4685        | 53.19 | 25000 | 0.7775          |
| 0.4685        | 53.72 | 25250 | 0.7690          |
| 0.4635        | 54.26 | 25500 | 0.7839          |
| 0.4635        | 54.79 | 25750 | 0.7746          |
| 0.4617        | 55.32 | 26000 | 0.7738          |
| 0.4617        | 55.85 | 26250 | 0.7753          |
| 0.4549        | 56.38 | 26500 | 0.7830          |
| 0.4549        | 56.91 | 26750 | 0.7777          |
| 0.4564        | 57.45 | 27000 | 0.7758          |
| 0.4564        | 57.98 | 27250 | 0.7728          |
| 0.4546        | 58.51 | 27500 | 0.7772          |
| 0.4546        | 59.04 | 27750 | 0.7795          |
| 0.4511        | 59.57 | 28000 | 0.7754          |
| 0.4511        | 60.11 | 28250 | 0.7867          |
| 0.4467        | 60.64 | 28500 | 0.7838          |
| 0.4467        | 61.17 | 28750 | 0.7858          |
| 0.4512        | 61.7  | 29000 | 0.7758          |
| 0.4512        | 62.23 | 29250 | 0.7819          |
| 0.4497        | 62.77 | 29500 | 0.7871          |
| 0.4497        | 63.3  | 29750 | 0.7817          |
| 0.4463        | 63.83 | 30000 | 0.7806          |

### Framework versions

- Transformers 4.36.0.dev0
- Pytorch 2.1.0+cu121
- Datasets 2.15.0
- Tokenizers 0.14.1