---
license: other
license_name: coqui-public-model-license
license_link: https://coqui.ai/cpml
library_name: coqui
pipeline_tag: text-to-speech
datasets:
- ylacombe/google-argentinian-spanish
language:
- es
---
# ⓍTTS 🇦🇷
ⓍTTS is a voice generation model that lets you clone voices into different languages using just a 6-second audio clip. It does not require many hours of training data.
This model was trained by IdeaLab at [CITECCA](https://mapatecnologico.rionegro.gov.ar/detail/citecca-centro-interdisciplinario-de-telecomunicaciones-electronica-computacion-y-ciencia-aplicada-unrn), at the [Universidad Nacional de Rio Negro](https://www.unrn.edu.ar/home).
### Language
This model's Spanish has been fine-tuned on [ylacombe's Google Argentinian Spanish dataset](https://huggingface.co/datasets/ylacombe/google-argentinian-spanish) to achieve an Argentinian accent.
### Training Parameters
```
batch_size=8,
grad_accum_steps=96,
batch_group_size=48,
eval_batch_size=8,
num_loader_workers=8,
eval_split_max_size=256,
optimizer="AdamW",
optimizer_wd_only_on_weights=True,
optimizer_params={"betas": [0.9, 0.96], "eps": 1e-8, "weight_decay": 1e-2},
lr=5e-06,
lr_scheduler="MultiStepLR",
lr_scheduler_params={"milestones": [50000 * 18, 150000 * 18, 300000 * 18], "gamma": 0.5, "last_epoch": -1},
```
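To make these settings easier to read, here is a minimal sketch of two quantities they imply: the effective batch size per optimizer step, and the learning rate that a `MultiStepLR` schedule yields at a given step. It assumes single-device training and that the scheduler is stepped once per optimizer step; neither is stated in this card, and the meaning of the `* 18` multiplier is not documented here. The helper `lr_at` is a hypothetical name that mirrors the closed form of `MultiStepLR` without requiring PyTorch.

```python
from bisect import bisect_right

# Values copied from the training parameters above.
batch_size = 8
grad_accum_steps = 96
base_lr = 5e-06
gamma = 0.5
milestones = [50000 * 18, 150000 * 18, 300000 * 18]

# Gradient accumulation sums gradients over several batches before each
# optimizer update, so one update sees this many samples (single device assumed).
effective_batch_size = batch_size * grad_accum_steps


def lr_at(step):
    """Learning rate at `step`: multiplied by gamma once per milestone reached."""
    return base_lr * gamma ** bisect_right(milestones, step)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    for step in [0] + milestones:
        print(step, lr_at(step))
```

With these values the learning rate is halved at each of the three milestones, so it ends at one eighth of its starting value.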
### License
This model is licensed under [Coqui Public Model License](https://coqui.ai/cpml). There's a lot that goes into a license for generative models, and you can read more of [the origin story of CPML here](https://coqui.ai/blog/tts/cpml).
Using the 🐸TTS command line:
```console
tts --model_name /path/to/xtts/ \
    --text "Che boludo, vamos a tomar unos mates." \
    --speaker_wav /path/to/target/speaker.wav \
    --language_idx es \
    --use_cuda true
```
Using the model directly:
```python
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts

# Load the model configuration shipped with the checkpoint.
config = XttsConfig()
config.load_json("/path/to/xtts/config.json")

# Build the model, load its weights in eval mode, and move it to the GPU.
model = Xtts.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", eval=True)
model.cuda()

# Synthesize Spanish speech in the voice of the reference clip.
outputs = model.synthesize(
    "Che boludo, vamos a tomar unos mates.",
    config,
    speaker_wav="/data/TTS-public/_refclips/3.wav",
    gpt_cond_len=3,
    language="es",
)
```
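The snippet above stops once `outputs` is returned. As a rough sketch of one way to write the result to disk, the helper below saves mono float samples as a 16-bit PCM WAV file using only the standard library. It assumes that `outputs["wav"]` holds samples in the range [-1.0, 1.0] and that the output rate is 24 kHz; neither is stated in this card, so check both against your installed 🐸TTS version. `save_wav` is a hypothetical helper, not part of the library.

```python
import struct
import wave


def save_wav(samples, path, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file."""
    frames = bytearray()
    for sample in samples:
        # Clip out-of-range values so they cannot overflow int16.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


# Hypothetical usage with the result of the synthesis call
# (the "wav" key is an assumption):
# save_wav(outputs["wav"], "output.wav")
```

Plain Python keeps the example free of extra dependencies. For long clips, an array library would convert the samples much faster than this per-sample loop.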