SpeechT5 TTS

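This model is a fine-tuned version of microsoft/speecht5_tts on the SDA dataset. On the evaluation set it reaches a validation loss of 0.4853 at the final logged step (40000); the lowest validation loss in the training log is 0.4773, at step 22000.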

Model description

More information needed

Training results

Training and validation loss were logged every 1000 steps:

Training Loss	Epoch	Step	Validation Loss
0.5703	1.49	1000	0.5289
0.541	2.98	2000	0.5131
0.5487	4.46	3000	0.5059
0.5232	5.95	4000	0.5011
0.5295	7.44	5000	0.4979
0.5257	8.93	6000	0.4970
0.5091	10.42	7000	0.4905
0.5141	11.9	8000	0.4893
0.5033	13.39	9000	0.4865
0.507	14.88	10000	0.4850
0.502	16.37	11000	0.4830
0.497	17.86	12000	0.4823
0.4974	19.35	13000	0.4801
0.4993	20.83	14000	0.4794
0.496	22.32	15000	0.4814
0.4845	23.81	16000	0.4780
0.4977	25.3	17000	0.4775
0.4888	26.79	18000	0.4780
0.4773	28.27	19000	0.4792
0.4914	29.76	20000	0.4817
0.4864	31.25	21000	0.4775
0.486	32.74	22000	0.4773
0.4884	34.23	23000	0.4835
0.4856	35.71	24000	0.4788
0.4814	37.2	25000	0.4811
0.4831	38.69	26000	0.4814
0.4732	40.18	27000	0.4816
0.4846	41.67	28000	0.4812
0.4731	43.15	29000	0.4843
0.4772	44.64	30000	0.4830
0.4793	46.13	31000	0.4834
0.4736	47.62	32000	0.4834
0.4798	49.11	33000	0.4826
0.4744	50.6	34000	0.4841
0.4784	52.08	35000	0.4844
0.4743	53.57	36000	0.4851
0.4779	55.06	37000	0.4854
0.4719	56.55	38000	0.4854
0.4825	58.04	39000	0.4856
0.4805	59.52	40000	0.4853
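
The validation loss in the table above reaches its minimum well before the final step and drifts upward afterwards. The following is a minimal sketch for locating that point, assuming the Validation Loss column is copied into a list in row order and that evaluation ran every 1000 steps, as the Step column indicates. The names `VAL_LOSS` and `best_checkpoint` are hypothetical helpers introduced here for illustration; they are not part of the training code.

```python
# Validation losses copied from the table above, one per 1000 training steps.
VAL_LOSS = [
    0.5289, 0.5131, 0.5059, 0.5011, 0.4979, 0.4970, 0.4905, 0.4893, 0.4865, 0.4850,
    0.4830, 0.4823, 0.4801, 0.4794, 0.4814, 0.4780, 0.4775, 0.4780, 0.4792, 0.4817,
    0.4775, 0.4773, 0.4835, 0.4788, 0.4811, 0.4814, 0.4816, 0.4812, 0.4843, 0.4830,
    0.4834, 0.4834, 0.4826, 0.4841, 0.4844, 0.4851, 0.4854, 0.4854, 0.4856, 0.4853,
]


def best_checkpoint(val_losses, eval_every=1000):
    """Return (step, loss) for the lowest validation loss.

    Assumes the i-th entry (0-based) was measured at step (i + 1) * eval_every.
    Ties resolve to the earliest step.
    """
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return (best_index + 1) * eval_every, val_losses[best_index]


best_step, best_loss = best_checkpoint(VAL_LOSS)
final_step = len(VAL_LOSS) * 1000
final_loss = VAL_LOSS[-1]

print(f"best:  step {best_step}, validation loss {best_loss:.4f}")
print(f"final: step {final_step}, validation loss {final_loss:.4f}")
print(f"gap:   {final_loss - best_loss:.4f}")
```

For this log the minimum falls at step 22000 (0.4773), while the final step reports 0.4853. Training loss keeps decreasing over the same range, which is consistent with mild overfitting in the later epochs. Whether the step-22000 weights were saved is not stated in this card.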