small_fix #5
by yingzhi - opened

README.md CHANGED
@@ -25,7 +25,7 @@ The pre-trained model takes texts or phonemes as input and produces a spectrogram
 
 ## Install SpeechBrain
 
-```
+```bash
 git clone https://github.com/speechbrain/speechbrain.git
 cd speechbrain
 pip install -r requirements.txt
@@ -37,7 +37,7 @@ Please notice that we encourage you to read our tutorials and learn more about
 
 ### Perform Text-to-Speech (TTS) with FastSpeech2
 
-```
+```python
 import torchaudio
 from speechbrain.pretrained import FastSpeech2
 from speechbrain.pretrained import HIFIGAN
@@ -81,7 +81,7 @@ torchaudio.save('example_TTS_input_phoneme.wav', waveforms.squeeze(1), 22050)
 
 If you want to generate multiple sentences in one-shot, you can do in this way:
 
-```
+```python
 from speechbrain.pretrained import FastSpeech2
 fastspeech2 = FastSpeech2.from_hparams(source="speechbrain/tts-fastspeech2-ljspeech", savedir="tmpdir_tts")
 items = [
@@ -89,8 +89,12 @@ items = [
   "How much wood would a woodchuck chuck?",
   "Never odd or even"
 ]
-mel_outputs, durations, pitch, energy = fastspeech2.encode_text(
-
+mel_outputs, durations, pitch, energy = fastspeech2.encode_text(
+  items,
+  pace=1.0,        # scale up/down the speed
+  pitch_rate=1.0,  # scale up/down the pitch
+  energy_rate=1.0, # scale up/down the energy
+)
 ```
 
 ### Inference on GPU
@@ -114,7 +118,7 @@ pip install -e .
 cd recipes/LJSpeech/TTS/fastspeech2/
 python train.py --device=cuda:0 --max_grad_norm=1.0 --data_folder=/your_folder/LJSpeech-1.1 hparams/train.yaml
 ```
-You can find our training results (models, logs, etc) [here](https://
+You can find our training results (models, logs, etc) [here](https://www.dropbox.com/sh/tqyp58ogejqfres/AAAtmq7cRoOR3XTsq0iSgyKBa?dl=0).
 
 ### Limitations
 The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
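The updated hunk stops at the `encode_text` call, so the batched mel spectrograms are left unvocoded. Below is a minimal sketch of how they could be turned into audio with the HiFi-GAN vocoder the README already imports; the vocoder source `speechbrain/tts-hifigan-ljspeech`, the `savedir` names, and the per-item save loop are illustrative assumptions, not lines from this diff.

```python
# Sketch only: vocode the batched mel outputs from the updated example.
# The HiFi-GAN source below is assumed (not shown in this diff).
import torchaudio
from speechbrain.pretrained import FastSpeech2, HIFIGAN

fastspeech2 = FastSpeech2.from_hparams(source="speechbrain/tts-fastspeech2-ljspeech", savedir="tmpdir_tts")
hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")

items = [
    "How much wood would a woodchuck chuck?",
    "Never odd or even",
]

# Same call as in the updated README: batched text-to-mel with prosody controls.
mel_outputs, durations, pitch, energy = fastspeech2.encode_text(
    items,
    pace=1.0,        # scale up/down the speed
    pitch_rate=1.0,  # scale up/down the pitch
    energy_rate=1.0, # scale up/down the energy
)

# Turn each mel spectrogram into a waveform and save it at 22050 Hz,
# matching the single-sentence example earlier in the README.
waveforms = hifi_gan.decode_batch(mel_outputs)
for i, wav in enumerate(waveforms):
    torchaudio.save(f"example_TTS_batch_{i}.wav", wav.cpu(), 22050)
```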
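The hunk context also references an "Inference on GPU" heading whose body falls outside this diff. As a hedged sketch only: SpeechBrain pretrained interfaces are typically moved to GPU by passing `run_opts` to `from_hparams`, for example:

```python
# Sketch of GPU inference with SpeechBrain pretrained interfaces
# (the "Inference on GPU" section body is not part of this diff).
from speechbrain.pretrained import FastSpeech2, HIFIGAN

fastspeech2 = FastSpeech2.from_hparams(
    source="speechbrain/tts-fastspeech2-ljspeech",
    savedir="tmpdir_tts",
    run_opts={"device": "cuda"},  # run the acoustic model on GPU
)
hifi_gan = HIFIGAN.from_hparams(
    source="speechbrain/tts-hifigan-ljspeech",  # assumed vocoder, as above
    savedir="tmpdir_vocoder",
    run_opts={"device": "cuda"},  # run the vocoder on GPU
)
```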