---
license: mit
base_model: microsoft/speecht5_tts
tags:
- generated_from_trainer
model-index:
- name: speecht5_finetuned_speaking_style_en
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# speecht5_finetuned_speaking_style_en

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts). The training dataset was not recorded by the Trainer, so it is not identified here.
It achieves the following results on the evaluation set:
- Loss: 0.3277

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 4000
- mixed_precision_training: Native AMP
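
As a rough illustration of how these values fit together, the sketch below recomputes the total train batch size and traces a linear schedule with 500 warmup steps over 4000 training steps. It is a plain-Python approximation, not the Trainer's own code: the function names are hypothetical, and `NUM_DEVICES = 1` is an assumption, since the card does not state the device count.

```python
# Values copied from the hyperparameter list above.
PER_DEVICE_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 8
BASE_LR = 1e-5
WARMUP_STEPS = 500
TRAINING_STEPS = 4000

# Assumption: a single device. The card does not say how many were used.
NUM_DEVICES = 1


def effective_batch_size(per_device, accum_steps, devices):
    """Number of examples that contribute to one optimizer update."""
    return per_device * accum_steps * devices


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    size = effective_batch_size(PER_DEVICE_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
    print("effective batch size:", size)
    for step in (0, 250, 500, 2250, 4000):
        print(step, linear_lr(step))
```

Under the single-device assumption, the product of `train_batch_size` and `gradient_accumulation_steps` matches the listed `total_train_batch_size` of 64. The learning rate peaks at `1e-05` at step 500 and reaches zero at step 4000.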

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.8232        | 0.61  | 100  | 0.5842          |
| 0.6949        | 1.23  | 200  | 0.4895          |
| 0.4918        | 1.84  | 300  | 0.3843          |
| 0.4266        | 2.45  | 400  | 0.3689          |
| 0.4098        | 3.07  | 500  | 0.3599          |
| 0.4026        | 3.68  | 600  | 0.3593          |
| 0.3947        | 4.29  | 700  | 0.3513          |
| 0.386         | 4.9   | 800  | 0.3481          |
| 0.3809        | 5.52  | 900  | 0.3457          |
| 0.3777        | 6.13  | 1000 | 0.3450          |
| 0.3745        | 6.74  | 1100 | 0.3418          |
| 0.3724        | 7.36  | 1200 | 0.3409          |
| 0.3697        | 7.97  | 1300 | 0.3404          |
| 0.3687        | 8.58  | 1400 | 0.3379          |
| 0.3684        | 9.2   | 1500 | 0.3373          |
| 0.3666        | 9.81  | 1600 | 0.3352          |
| 0.3637        | 10.42 | 1700 | 0.3395          |
| 0.3638        | 11.03 | 1800 | 0.3333          |
| 0.3594        | 11.65 | 1900 | 0.3333          |
| 0.3603        | 12.26 | 2000 | 0.3378          |
| 0.3592        | 12.87 | 2100 | 0.3316          |
| 0.3587        | 13.49 | 2200 | 0.3321          |
| 0.3557        | 14.1  | 2300 | 0.3311          |
| 0.3568        | 14.71 | 2400 | 0.3300          |
| 0.3595        | 15.33 | 2500 | 0.3291          |
| 0.3565        | 15.94 | 2600 | 0.3323          |
| 0.3549        | 16.55 | 2700 | 0.3305          |
| 0.3534        | 17.16 | 2800 | 0.3299          |
| 0.3545        | 17.78 | 2900 | 0.3268          |
| 0.3533        | 18.39 | 3000 | 0.3298          |
| 0.3529        | 19.0  | 3100 | 0.3306          |
| 0.3526        | 19.62 | 3200 | 0.3285          |
| 0.3513        | 20.23 | 3300 | 0.3274          |
| 0.3513        | 20.84 | 3400 | 0.3278          |
| 0.3505        | 21.46 | 3500 | 0.3295          |
| 0.3502        | 22.07 | 3600 | 0.3283          |
| 0.3505        | 22.68 | 3700 | 0.3295          |
| 0.3527        | 23.3  | 3800 | 0.3289          |
| 0.3518        | 23.91 | 3900 | 0.3275          |
| 0.3496        | 24.52 | 4000 | 0.3277          |
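
The validation loss flattens out after roughly step 2000, and the lowest value in the table is not the last one. The sketch below picks the best row from an excerpt of the table (steps 2800 to 4000) and compares it with the final row. The helper name `best_checkpoint` is hypothetical, and whether a checkpoint was actually saved at the best step is not stated in this card.

```python
# (step, validation_loss) pairs copied from the last rows of the table above.
EVAL_LOG = [
    (2800, 0.3299),
    (2900, 0.3268),
    (3000, 0.3298),
    (3100, 0.3306),
    (3200, 0.3285),
    (3300, 0.3274),
    (3400, 0.3278),
    (3500, 0.3295),
    (3600, 0.3283),
    (3700, 0.3295),
    (3800, 0.3289),
    (3900, 0.3275),
    (4000, 0.3277),
]


def best_checkpoint(log):
    """Return the (step, loss) pair with the lowest validation loss."""
    return min(log, key=lambda row: row[1])


best_step, best_loss = best_checkpoint(EVAL_LOG)
final_step, final_loss = EVAL_LOG[-1]

# Difference between the final evaluation and the best one in the excerpt.
gap = final_loss - best_loss

if __name__ == "__main__":
    print(f"best:  step {best_step}, loss {best_loss:.4f}")
    print(f"final: step {final_step}, loss {final_loss:.4f}")
    print(f"gap:   {gap:.4f}")
```

The best validation loss in this excerpt is 0.3268 at step 2900, about 0.0009 below the final value of 0.3277, a difference small enough that it may simply reflect evaluation noise.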


### Framework versions

- Transformers 4.38.2
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2