File size: 3,106 Bytes
a9384d7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
# Tutorial For Nervous Beginners

## Installation

User friendly installation. Recommended only for synthesizing voice.

```bash
$ pip install TTS
```

Developer friendly installation.

```bash
$ git clone https://github.com/coqui-ai/TTS
$ cd TTS
$ pip install -e .
```

## Training a `tts` Model

A breakdown of a simple script that trains a GlowTTS model on the LJspeech dataset. See the comments for more details.

### Pure Python Way

0. Download your dataset.

    In this example, we download and use the LJSpeech dataset. Set the download directory based on your preferences.

    ```bash
    $ python -c 'from TTS.utils.downloaders import download_ljspeech; download_ljspeech("../recipes/ljspeech/");'
    ```

1. Define `train.py`.

    ```{literalinclude} ../../recipes/ljspeech/glow_tts/train_glowtts.py
    ```

2. Run the script.

    ```bash
    CUDA_VISIBLE_DEVICES=0 python train.py
    ```

    - Continue a previous run.

        ```bash
        CUDA_VISIBLE_DEVICES=0 python train.py --continue_path path/to/previous/run/folder/
        ```

    - Fine-tune a model.

        ```bash
        CUDA_VISIBLE_DEVICES=0 python train.py --restore_path path/to/model/checkpoint.pth
        ```

    - Run multi-gpu training.

        ```bash
        CUDA_VISIBLE_DEVICES=0,1,2 python -m trainer.distribute --script train.py
        ```

### CLI Way

We still support running training from CLI like in the old days. The same training run can also be started as follows.

1. Define your `config.json`

    ```json
    {
        "run_name": "my_run",
        "model": "glow_tts",
        "batch_size": 32,
        "eval_batch_size": 16,
        "num_loader_workers": 4,
        "num_eval_loader_workers": 4,
        "run_eval": true,
        "test_delay_epochs": -1,
        "epochs": 1000,
        "text_cleaner": "english_cleaners",
        "use_phonemes": false,
        "phoneme_language": "en-us",
        "phoneme_cache_path": "phoneme_cache",
        "print_step": 25,
        "print_eval": true,
        "mixed_precision": false,
        "output_path": "recipes/ljspeech/glow_tts/",
        "datasets":[{"formatter": "ljspeech", "meta_file_train":"metadata.csv", "path": "recipes/ljspeech/LJSpeech-1.1/"}]
    }
    ```

2. Start training.
    ```bash
    $ CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tts.py --config_path config.json
    ```

## Training a `vocoder` Model

```{literalinclude} ../../recipes/ljspeech/hifigan/train_hifigan.py
```

❗️ Note that you can also use ```train_vocoder.py``` as the ```tts``` models above.

## Synthesizing Speech

You can run `tts` and synthesize speech directly on the terminal.

```bash
$ tts -h # see the help
$ tts --list_models  # list the available models.
```

![cli.gif](https://github.com/coqui-ai/TTS/raw/main/images/tts_cli.gif)


You can call `tts-server` to start a local demo server that you can open it on
your favorite web browser and 🗣️.

```bash
$ tts-server -h # see the help
$ tts-server --list_models  # list the available models.
```
![server.gif](https://github.com/coqui-ai/TTS/raw/main/images/demo_server.gif)