---
license: mit
base_model: miosipof/speecht5_tts_voxpopuli_it_v2
tags:
  - generated_from_trainer
datasets:
  - audiofolder
model-index:
  - name: speecht5_tts_dysarthria_v1
    results: []
---

# speecht5_tts_dysarthria_v1

This model is a fine-tuned version of [miosipof/speecht5_tts_voxpopuli_it_v2](https://huggingface.co/miosipof/speecht5_tts_voxpopuli_it_v2) on the audiofolder dataset.
It achieves the following results on the evaluation set:

- Loss: 0.5234

## Model description

More information needed
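Although this section of the card is empty, a minimal inference sketch for a SpeechT5 TTS checkpoint like this one might look as follows. The repository ID `miosipof/speecht5_tts_dysarthria_v1` is inferred from the model name, and the random speaker embedding is a stand-in for a real x-vector — both are assumptions, not details from this card:

```python
import torch
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Load the fine-tuned checkpoint (repo ID assumed from the model name)
# together with the standard SpeechT5 HiFi-GAN vocoder.
processor = SpeechT5Processor.from_pretrained("miosipof/speecht5_tts_dysarthria_v1")
model = SpeechT5ForTextToSpeech.from_pretrained("miosipof/speecht5_tts_dysarthria_v1")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Buongiorno, come stai?", return_tensors="pt")

# SpeechT5 conditions generation on a 512-dim speaker embedding (x-vector);
# a random vector is used here purely as a placeholder (assumption).
speaker_embeddings = torch.randn(1, 512)

# Returns a 1-D float tensor of 16 kHz audio samples.
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
```

In practice the placeholder embedding should be replaced with an x-vector extracted from a reference recording of the target speaker.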

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-06
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 500
- mixed_precision_training: Native AMP
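The list above corresponds roughly to the following 🤗 Transformers `Seq2SeqTrainingArguments`; this is a hedged reconstruction, and `output_dir` in particular is an assumption rather than a value stated in this card:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstructed from the hyperparameter list above (output_dir is assumed).
training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts_dysarthria_v1",
    learning_rate=5e-6,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch: 16 * 2 = 32
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=500,                  # "training_steps: 500"
    fp16=True,                      # "Native AMP" mixed precision
)
```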

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.0113        | 0.7042  | 25   | 0.8442          |
| 0.8           | 1.4085  | 50   | 0.7084          |
| 0.7291        | 2.1127  | 75   | 0.6323          |
| 0.6698        | 2.8169  | 100  | 0.5875          |
| 0.6339        | 3.5211  | 125  | 0.5633          |
| 0.5747        | 4.2254  | 150  | 0.5552          |
| 0.5837        | 4.9296  | 175  | 0.5436          |
| 0.5882        | 5.6338  | 200  | 0.5417          |
| 0.5692        | 6.3380  | 225  | 0.5363          |
| 0.5577        | 7.0423  | 250  | 0.5340          |
| 0.5411        | 7.7465  | 275  | 0.5323          |
| 0.5551        | 8.4507  | 300  | 0.5301          |
| 0.5671        | 9.1549  | 325  | 0.5292          |
| 0.5313        | 9.8592  | 350  | 0.5254          |
| 0.5546        | 10.5634 | 375  | 0.5246          |
| 0.5283        | 11.2676 | 400  | 0.5231          |
| 0.5484        | 11.9718 | 425  | 0.5222          |
| 0.5251        | 12.6761 | 450  | 0.5222          |
| 0.5443        | 13.3803 | 475  | 0.5223          |
| 0.5357        | 14.0845 | 500  | 0.5234          |
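As a sanity check on the table, the step and epoch columns imply roughly 35.5 optimizer steps per epoch, which at the effective batch size of 32 suggests a training set of about 1,100 examples. This is a back-of-the-envelope estimate derived from the numbers above, not a figure stated in this card:

```python
# Back-of-the-envelope estimate derived from the training-results table.
total_steps = 500
final_epoch = 14.0845      # epoch value reported at step 500
effective_batch = 16 * 2   # train_batch_size * gradient_accumulation_steps

steps_per_epoch = total_steps / final_epoch
approx_dataset_size = steps_per_epoch * effective_batch

print(round(steps_per_epoch, 1))   # ~35.5 steps per epoch
print(round(approx_dataset_size))  # ~1136 examples (estimate)
```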

### Framework versions

- Transformers 4.43.3
- Pytorch 2.5.0+cu124
- Datasets 3.0.1
- Tokenizers 0.19.1