SpeechT5 TTS English Accented

This model is a fine-tuned version of microsoft/speecht5_tts on the Common Voice dataset. On the evaluation set it reaches a validation loss of 0.5854 at the final training step (step 10000, epoch 56.5).

Model description

More information needed

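Training results

The hyperparameter list is not included in this card. The table below reports validation loss every 250 steps; the training loss column changes every 500 steps, so each logged value appears on two consecutive rows, and the first row has no training loss logged yet.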

Training Loss	Epoch	Step	Validation Loss
No log	1.41	250	0.5448
0.6715	2.82	500	0.5147
0.6715	4.24	750	0.5225
0.5532	5.65	1000	0.5096
0.5532	7.06	1250	0.5293
0.5156	8.47	1500	0.5310
0.5156	9.89	1750	0.5417
0.4874	11.3	2000	0.5185
0.4874	12.71	2250	0.5112
0.4693	14.12	2500	0.5154
0.4693	15.54	2750	0.5148
0.4619	16.95	3000	0.5367
0.4619	18.36	3250	0.5207
0.447	19.77	3500	0.5318
0.447	21.19	3750	0.5286
0.4348	22.6	4000	0.5345
0.4348	24.01	4250	0.5362
0.4237	25.42	4500	0.5568
0.4237	26.84	4750	0.5352
0.4195	28.25	5000	0.5395
0.4195	29.66	5250	0.5487
0.4132	31.07	5500	0.5443
0.4132	32.49	5750	0.5491
0.3975	33.9	6000	0.5465
0.3975	35.31	6250	0.5505
0.396	36.72	6500	0.5450
0.396	38.14	6750	0.5510
0.3884	39.55	7000	0.5517
0.3884	40.96	7250	0.5685
0.383	42.37	7500	0.5622
0.383	43.79	7750	0.5659
0.3806	45.2	8000	0.5636
0.3806	46.61	8250	0.5681
0.3738	48.02	8500	0.5797
0.3738	49.44	8750	0.5741
0.3705	50.85	9000	0.5765
0.3705	52.26	9250	0.5770
0.364	53.67	9500	0.5854
0.364	55.08	9750	0.5806
0.36	56.5	10000	0.5854
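
The table can be hard to scan for the best checkpoint, so here is a minimal sketch that finds the evaluation step with the lowest validation loss. The values are copied from the Validation Loss column above; the names `VAL_LOSS`, `EVAL_EVERY`, and `best_checkpoint` are illustrative and not part of the original training code, and the sketch assumes evaluations happened at a fixed interval of 250 steps, as the Step column shows.

```python
# Validation losses copied from the table above, one per evaluation (every 250 steps).
VAL_LOSS = [
    0.5448, 0.5147, 0.5225, 0.5096, 0.5293, 0.5310, 0.5417, 0.5185,
    0.5112, 0.5154, 0.5148, 0.5367, 0.5207, 0.5318, 0.5286, 0.5345,
    0.5362, 0.5568, 0.5352, 0.5395, 0.5487, 0.5443, 0.5491, 0.5465,
    0.5505, 0.5450, 0.5510, 0.5517, 0.5685, 0.5622, 0.5659, 0.5636,
    0.5681, 0.5797, 0.5741, 0.5765, 0.5770, 0.5854, 0.5806, 0.5854,
]
EVAL_EVERY = 250  # steps between evaluations, read off the Step column


def best_checkpoint(val_losses, eval_every):
    """Return (step, loss) for the evaluation with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return (best_index + 1) * eval_every, val_losses[best_index]


best_step, best_loss = best_checkpoint(VAL_LOSS, EVAL_EVERY)
final_loss = VAL_LOSS[-1]

print(f"best step: {best_step}, validation loss: {best_loss:.4f}")
print(f"final validation loss: {final_loss:.4f} (gap: {final_loss - best_loss:.4f})")
```

In this log the lowest validation loss occurs early (step 1000), while training loss keeps falling through step 10000. That pattern is usually read as overfitting, so an earlier checkpoint may be worth comparing against the final one by listening to samples, since validation loss alone does not fully capture speech quality.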