---
library_name: transformers
language:
  - pt
license: mit
base_model: microsoft/speechT5_tts
tags:
  - generated_from_trainer
datasets:
  - ylacombe/cml-tts
model-index:
  - name: speechT5_tts-finetuned-cml-tts2
    results: []
pipeline_tag: text-to-speech
---

# speechT5_tts-finetuned-cml-tts2

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the [ylacombe/cml-tts](https://huggingface.co/datasets/ylacombe/cml-tts) dataset. It achieves the following results on the evaluation set:

- Loss: 0.4595

## Model description

SpeechT5 fine-tuned on the Portuguese portion of the CML-TTS dataset for the Audio course Unit 6 hands-on exercise; training ran for about 5 hours. Honestly, it is not that good, but it is definitely better than the initial SpeechT5. More information here: https://outleys.site/en/development/AI/hugface-audio-course-handson-unit-6-exercise/

## Intended uses & limitations

More information needed
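As a starting point, here is a minimal inference sketch. It assumes the checkpoint is published as `GatinhoEducado/speechT5_tts-finetuned-cml-tts2` (the repository id is inferred from the model name above) and borrows a speaker x-vector from the `Matthijs/cmu-arctic-xvectors` dataset, since SpeechT5 conditions generation on a 512-dimensional speaker embedding:

```python
import torch
import soundfile as sf
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

# Repository id is an assumption based on the model name in this card
checkpoint = "GatinhoEducado/speechT5_tts-finetuned-cml-tts2"
processor = SpeechT5Processor.from_pretrained(checkpoint)
model = SpeechT5ForTextToSpeech.from_pretrained(checkpoint)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# SpeechT5 needs a 512-dim x-vector speaker embedding; this one comes from
# the CMU ARCTIC x-vector set used in the SpeechT5 documentation examples
embeddings = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = torch.tensor(embeddings[7306]["xvector"]).unsqueeze(0)

inputs = processor(text="Olá, como você está?", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("output.wav", speech.numpy(), samplerate=16000)
```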

## Training and evaluation data

More information needed
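To inspect the training data yourself, the Portuguese subset of CML-TTS can be loaded as follows; the `"portuguese"` configuration name and the `"text"` column are assumptions based on the dataset card:

```python
from datasets import load_dataset

# "portuguese" is the assumed config name for the Portuguese subset
dataset = load_dataset("ylacombe/cml-tts", "portuguese", split="train")
print(dataset[0]["text"])  # assumed transcript column
```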

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.99) and epsilon=1e-07; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 16000
- mixed_precision_training: Native AMP
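For reference, a minimal sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`; the output directory and the evaluation cadence (every 1000 steps, inferred from the results table below) are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speechT5_tts-finetuned-cml-tts2",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.99,
    adam_epsilon=1e-7,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    max_steps=16000,
    fp16=True,  # native AMP mixed precision
    eval_strategy="steps",  # every 1000 steps, inferred from the results table
    eval_steps=1000,
)
```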

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.4819        | 0.0625 | 1000  | 0.5007          |
| 0.4364        | 0.125  | 2000  | 0.4965          |
| 0.4224        | 0.1875 | 3000  | 0.4841          |
| 0.4006        | 1.0473 | 4000  | 0.4782          |
| 0.3993        | 1.1098 | 5000  | 0.4728          |
| 0.3993        | 1.1723 | 6000  | 0.4687          |
| 0.389         | 2.032  | 7000  | 0.4684          |
| 0.3827        | 2.0945 | 8000  | 0.4665          |
| 0.3895        | 2.157  | 9000  | 0.4702          |
| 0.3829        | 3.0168 | 10000 | 0.4648          |
| 0.3717        | 3.0793 | 11000 | 0.4631          |
| 0.384         | 3.1418 | 12000 | 0.4627          |
| 0.3802        | 4.0015 | 13000 | 0.4601          |
| 0.3667        | 4.064  | 14000 | 0.4610          |
| 0.3757        | 4.1265 | 15000 | 0.4606          |
| 0.375         | 4.189  | 16000 | 0.4595          |

### Framework versions

- Transformers 4.46.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3