---
license: mit
base_model: microsoft/speecht5_tts
tags:
- generated_from_trainer
datasets:
- facebook/voxpopuli
tasks:
- text-to-speech
model-index:
- name: speecht5_tts-ft-voxpopuli-it
  results: 
  - task:
      name: Text To Speech
      type: text-to-speech
    dataset:
      name: facebook/voxpopuli
      type: facebook/voxpopuli
      config: it
      split: train
      args: it
    metrics:
    - name: N.A.
      type: N.A.
      value: N.A.
language:
- it
---



# speecht5_tts-ft-voxpopuli-it

This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on the facebook/voxpopuli dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5126

## Model description

This checkpoint is SpeechT5 fine-tuned for Italian text-to-speech on the Italian (`it`) split of facebook/voxpopuli. Generation is conditioned on 512-dimensional x-vector speaker embeddings produced by [speechbrain/spkrec-xvect-voxceleb](https://huggingface.co/speechbrain/spkrec-xvect-voxceleb).
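
A minimal inference sketch, assuming the standard `transformers` SpeechT5 API and a `speechbrain` x-vector encoder; the checkpoint id and the 16 kHz reference clip `reference_speaker.wav` are placeholders:

```python
import torch
import soundfile as sf
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan
from speechbrain.pretrained import EncoderClassifier

# Load this checkpoint (placeholder id; replace with the actual Hub repo id or a local path).
checkpoint = "speecht5_tts-ft-voxpopuli-it"
processor = SpeechT5Processor.from_pretrained(checkpoint)
model = SpeechT5ForTextToSpeech.from_pretrained(checkpoint)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# Derive a 512-dim x-vector for the target voice from a 16 kHz reference clip.
encoder = EncoderClassifier.from_hparams(source="speechbrain/spkrec-xvect-voxceleb")
waveform, sr = sf.read("reference_speaker.wav")  # placeholder file
with torch.no_grad():
    xvector = encoder.encode_batch(torch.tensor(waveform, dtype=torch.float32).unsqueeze(0))
    speaker_embeddings = torch.nn.functional.normalize(xvector, dim=2).squeeze(0)  # shape (1, 512)

# Synthesize Italian speech and save it as a 16 kHz wav file.
inputs = processor(text="Buongiorno a tutti.", return_tensors="pt")
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("output.wav", speech.numpy(), samplerate=16000)
```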

## Intended uses & limitations

The model is intended for Italian text-to-speech, conditioned on a speaker embedding for the target voice. Further details on limitations are not documented.

## Training and evaluation data

The Italian (`it`) train split of facebook/voxpopuli was divided into training and evaluation sets with `test_size=0.15`, i.e. 15% of the data held out for evaluation.
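
A sketch of how such a split can be produced with the `datasets` API (the shuffle seed is an assumption):

```python
from datasets import load_dataset

# Italian portion of VoxPopuli; hold out 15% for evaluation.
dataset = load_dataset("facebook/voxpopuli", "it", split="train")
splits = dataset.train_test_split(test_size=0.15, seed=42)  # seed is an assumption
train_ds, eval_ds = splits["train"], splits["test"]
```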

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an equivalent `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 300
- training_steps: 1000
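
Expressed as `Seq2SeqTrainingArguments`, the settings above correspond roughly to the sketch below; the output directory and the evaluation cadence are assumptions (the results table suggests evaluation every 300 steps):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5_tts-ft-voxpopuli-it",  # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,  # effective train batch size: 8 * 4 = 32
    lr_scheduler_type="linear",
    warmup_steps=300,
    max_steps=1000,
    seed=42,
    evaluation_strategy="steps",  # assumption, inferred from the results table
    eval_steps=300,               # assumption
)
```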

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.6118        | 1.94  | 300  | 0.5508          |
| 0.5729        | 3.89  | 600  | 0.5204          |
| 0.563         | 5.83  | 900  | 0.5126          |


### Framework versions

- Transformers 4.33.0
- Pytorch 1.12.1+cu116
- Datasets 2.14.4
- Tokenizers 0.12.1