Readme updates
README.md CHANGED
@@ -21,7 +21,9 @@ datasets:
# Aina Project's Catalan multi-speaker text-to-speech model
## Model description
-This model was trained from scratch using the [Coqui TTS](https://github.com/coqui-ai/TTS) toolkit on a combination of 3 datasets: [Festcat](http://festcat.talp.cat/devel.php), [
+This model was trained from scratch using the [Coqui TTS](https://github.com/coqui-ai/TTS) toolkit on a combination of 3 datasets: [Festcat](http://festcat.talp.cat/devel.php), the high-quality open speech dataset from [Google](http://openslr.org/69/), and [Common Voice v8](https://commonvoice.mozilla.org/ca). For training, 101,460 utterances from 257 speakers were used, corresponding to nearly 138 hours of speech.
+
+A live inference demo can be found in our Hugging Face Spaces, [here](https://huggingface.co/spaces/projecte-aina/tts-ca-coqui-vits-multispeaker).
## Intended uses and limitations
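For context on running inference with the released model: the usage snippet referenced in the next hunk header calls `wavs = synthesizer.tts(text, speaker_idx)`. A minimal sketch of that flow with the Coqui TTS `Synthesizer` class might look like the following; the checkpoint and config file names and the speaker ID are assumptions, so check the model repository for the actual values.

```python
from TTS.utils.synthesizer import Synthesizer

# File names and the speaker ID below are placeholders; check the model
# repository for the actual checkpoint/config names and available speakers.
synthesizer = Synthesizer(
    tts_checkpoint="model.pth",     # assumed checkpoint file name
    tts_config_path="config.json",  # assumed config file name
    use_cuda=False,
)

text = "Bon dia, benvinguts al projecte Aina."
speaker_idx = "ona"  # assumed speaker ID

# Same call that appears as context in the following hunk header.
wavs = synthesizer.tts(text, speaker_idx)
synthesizer.save_wav(wavs, "output.wav")
```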
@@ -67,7 +69,7 @@ wavs = synthesizer.tts(text, speaker_idx)
## Training
### Training Procedure
### Data preparation
-The data has been processed using the script [process_data.sh](https://huggingface.co/projecte-aina/tts-
+The data has been processed using the script [process_data.sh](https://huggingface.co/projecte-aina/tts-ca-coqui-vits-multispeaker/blob/main/data_processing/process_data.sh), which reduces the sampling frequency of the audio files, removes silences, adds padding, and structures the data into the format accepted by the framework. You can find more information [here](https://huggingface.co/projecte-aina/tts-ca-coqui-vits-multispeaker/blob/main/data_processing/README.md).
### Hyperparameter
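The data-preparation line above points to `process_data.sh` for the actual steps. Purely as an illustration of the operations it describes (reducing the sampling frequency, removing silences, adding padding, and laying the data out for the framework), a rough Python equivalent could look like the sketch below; the target sample rate, trim threshold, padding length, and metadata layout are assumptions rather than values taken from the script.

```python
from pathlib import Path

import librosa
import numpy as np
import soundfile as sf

# Illustrative values only; the real settings live in process_data.sh.
TARGET_SR = 22050   # assumed reduced sampling frequency
TOP_DB = 30         # assumed silence-trimming threshold
PAD_SECONDS = 0.1   # assumed padding added at each end

def process_clip(in_path: Path, out_path: Path) -> None:
    """Downsample, trim silence, and pad a single utterance."""
    wav, _ = librosa.load(in_path, sr=TARGET_SR)        # resample while loading
    wav, _ = librosa.effects.trim(wav, top_db=TOP_DB)   # drop leading/trailing silence
    pad = np.zeros(int(PAD_SECONDS * TARGET_SR), dtype=wav.dtype)
    wav = np.concatenate([pad, wav, pad])                # add padding on both sides
    sf.write(out_path, wav, TARGET_SR)

def write_metadata(rows, out_file: Path) -> None:
    """Write 'audio_id|transcript' lines, an LJSpeech-style layout that Coqui TTS
    formatters can ingest (assumed, not necessarily the script's exact format)."""
    with out_file.open("w", encoding="utf-8") as f:
        for audio_id, transcript in rows:
            f.write(f"{audio_id}|{transcript}\n")
```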
@@ -116,7 +118,7 @@ Copyright (c) 2022 Text Mining Unit at Barcelona Supercomputing Center
[Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
### Funding
-This work was funded by the [
+This work was funded by the [Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en)) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
## Disclaimer