
speecht5_finetune_binisha

This model is a fine-tuned version of microsoft/speecht5_tts on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3836
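
The card does not include usage code; below is a minimal inference sketch assuming the standard SpeechT5 text-to-speech pipeline from Transformers. The speaker x-vector is a stand-in from the CMU ARCTIC dataset, not necessarily the embedding used during fine-tuning.

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Load the fine-tuned checkpoint plus the standard SpeechT5 HiFi-GAN vocoder.
processor = SpeechT5Processor.from_pretrained("binisha/speecht5_finetune_binisha")
model = SpeechT5ForTextToSpeech.from_pretrained("binisha/speecht5_finetune_binisha")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 conditions generation on a speaker embedding; a CMU ARCTIC
# x-vector is used here as a placeholder (the fine-tuning speaker is unknown).
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embedding = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

inputs = processor(text="Hello, this is a test.", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embedding, vocoder=vocoder)

# SpeechT5 produces 16 kHz audio.
sf.write("speech.wav", speech.numpy(), samplerate=16000)
```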

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1500
  • mixed_precision_training: Native AMP
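
Expressed as Seq2SeqTrainingArguments, these settings would look roughly as follows. This is a sketch only: output_dir is a hypothetical placeholder, and the Trainer, model, and data wiring are omitted.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_finetune_binisha",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,  # effective train batch size: 4 * 8 = 32
    optim="adamw_torch",            # AdamW, betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=1500,
    seed=42,
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=100,                 # matches the 100-step cadence in the results table
)
```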

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 0.6028        | 2.7586  | 100  | 0.5187          |
| 0.5195        | 5.5172  | 200  | 0.4851          |
| 0.5075        | 8.2759  | 300  | 0.4708          |
| 0.462         | 11.0345 | 400  | 0.4609          |
| 0.4429        | 13.7931 | 500  | 0.4294          |
| 0.4303        | 16.5517 | 600  | 0.4249          |
| 0.4172        | 19.3103 | 700  | 0.4184          |
| 0.402         | 22.0690 | 800  | 0.4077          |
| 0.3898        | 24.8276 | 900  | 0.3975          |
| 0.3966        | 27.5862 | 1000 | 0.4197          |
| 0.3773        | 30.3448 | 1100 | 0.3955          |
| 0.3658        | 33.1034 | 1200 | 0.3878          |
| 0.3644        | 35.8621 | 1300 | 0.3878          |
| 0.3622        | 38.6207 | 1400 | 0.3841          |
| 0.3671        | 41.3793 | 1500 | 0.3836          |

Framework versions

  • Transformers 4.46.2
  • PyTorch 2.5.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3