speechbrain/tts-fastspeech2-ljspeech


small_fix

#5

by yingzhi - opened Jul 25, 2023

base: refs/heads/main ← from: refs/pr/5

Files changed (1)

README.md +10 -6

@@ -25,7 +25,7 @@ The pre-trained model takes texts or phonemes as input and produces a spectrogra
 ## Install SpeechBrain
-```
+```bash
 git clone https://github.com/speechbrain/speechbrain.git
 cd speechbrain
 pip install -r requirements.txt
@@ -37,7 +37,7 @@ Please notice that we encourage you to read our tutorials and learn more about
 ### Perform Text-to-Speech (TTS) with FastSpeech2
-```
+```python
 import torchaudio
 from speechbrain.pretrained import FastSpeech2
 from speechbrain.pretrained import HIFIGAN
@@ -81,7 +81,7 @@ torchaudio.save('example_TTS_input_phoneme.wav', waveforms.squeeze(1), 22050)
 If you want to generate multiple sentences in one-shot, you can do in this way:
-```
+```python
 from speechbrain.pretrained import FastSpeech2
 fastspeech2 = FastSpeech2.from_hparams(source="speechbrain/tts-fastspeech2-ljspeech", savedir="tmpdir_tts")
 items = [
@@ -89,8 +89,12 @@ items = [
        "How much wood would a woodchuck chuck?",
        "Never odd or even"
      ]
-mel_outputs, durations, pitch, energy = fastspeech2.encode_text(items)
+mel_outputs, durations, pitch, energy = fastspeech2.encode_text(
+  items,
+  pace=1.0,        # scale up/down the speed
+  pitch_rate=1.0,  # scale up/down the pitch
+  energy_rate=1.0, # scale up/down the energy
+)
 ```
 ### Inference on GPU
@@ -114,7 +118,7 @@ pip install -e .
 cd recipes/LJSpeech/TTS/fastspeech2/
 python train.py --device=cuda:0 --max_grad_norm=1.0 --data_folder=/your_folder/LJSpeech-1.1 hparams/train.yaml
 ```
-You can find our training results (models, logs, etc) [here](https://drive.google.com/drive/folders/1Yb8CDCrW7JF1_jg8Xc4U15z3W37VjrY5?usp=share_link).
+You can find our training results (models, logs, etc) [here](https://www.dropbox.com/sh/tqyp58ogejqfres/AAAtmq7cRoOR3XTsq0iSgyKBa?dl=0).
 ### Limitations
 The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
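
The `pace`, `pitch_rate`, and `energy_rate` arguments that this change adds to the `encode_text` call are described only as "scale up/down" factors. The sketch below illustrates one plausible reading of that scaling in plain Python. It is not SpeechBrain's implementation: the helper names `scale_durations` and `scale_contour` are hypothetical, and it assumes that `pace` divides the predicted per-phoneme durations (so `pace > 1` shortens the utterance) while the two rates multiply the predicted pitch and energy values. The diff itself does not confirm either assumption.

```python
def scale_durations(durations, pace=1.0):
    """Hypothetical helper: rescale per-phoneme frame counts by 1 / pace.

    Assumption (not confirmed by the diff): pace > 1 means faster speech,
    i.e. fewer frames per phoneme; pace < 1 means slower speech.
    """
    if pace <= 0:
        raise ValueError("pace must be positive")
    # Keep at least one frame per phoneme so no phoneme disappears.
    return [max(1, round(d / pace)) for d in durations]


def scale_contour(values, rate=1.0):
    """Hypothetical helper: multiply a pitch or energy contour by a rate."""
    return [v * rate for v in values]


# Toy per-phoneme durations, measured in spectrogram frames.
durations = [4, 6, 10]

fast = scale_durations(durations, pace=2.0)  # half as many frames
slow = scale_durations(durations, pace=0.5)  # twice as many frames
louder = scale_contour([1.0, 2.0], rate=1.5)

print(fast, slow, louder)
```

With all three values left at `1.0`, as in the updated README example, this kind of scaling is a no-op, which matches the behavior of the original single-argument call.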