Krisshvamsi committed on
Commit
9f21d56
1 Parent(s): 5198208

Update README.md

Files changed (1)
  1. README.md +20 -24
README.md CHANGED
@@ -28,28 +28,43 @@ The pre-trained model takes in input a short text and produces a spectrogram in
28
  ```
29
  pip install speechbrain
30
  ```
31
- ### Perform Text-to-Speech (TTS)
32
 
33
  ```python
34
  import torchaudio
35
  from TTSModel import TTSModel
36
- from Models import *
37
  from speechbrain.inference.vocoders import HIFIGAN
38
 
39
  texts = ["This is a sample text for synthesis."]
40
 
 
41
  # Initialize TTS (Transformer) and Vocoder (HiFi-GAN)
42
- my_tts_model = TTSModel.from_hparams(source="model_source_path")
43
  hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
44
 
45
  # Running the TTS
46
- mel_output, mel_length = my_tts_model.encode_text(texts)
47
 
48
  # Running Vocoder (spectrogram-to-waveform)
49
  waveforms = hifi_gan.decode_batch(mel_output)
50
 
51
  # Save the waveform
52
  torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 22050)
 
53
  ```
54
 
55
  If you want to generate multiple sentences in one shot, pass the sentences as items in a list.
@@ -58,26 +73,7 @@ If you want to generate multiple sentences in one-shot, pass the sentences as it
58
  ### Inference on GPU
59
  To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
60
 
61
-
62
- ### Training
63
- The model was trained with SpeechBrain.
64
- To train it from scratch follow these steps:
65
- 1. Clone SpeechBrain:
66
- ```bash
67
- git clone https://github.com/speechbrain/speechbrain/
68
- ```
69
- 2. Install it:
70
- ```bash
71
- cd speechbrain
72
- pip install -r requirements.txt
73
- pip install -e .
74
- ```
75
- 3. Run Training:
76
- ```bash
77
- cd recipes/LJSpeech/TTS/tacotron2/
78
- python train.py --device=cuda:0 --max_grad_norm=1.0 --data_folder=/your_folder/LJSpeech-1.1 hparams/train.yaml
79
- ```
80
-
81
 
82
  ### Limitations
83
  The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
 
28
  ```
29
  pip install speechbrain
30
  ```
31
+ ### Perform Text-to-Speech (TTS) - Running Inference
32
+ To run inference, download the inference directory as shown in the cell below.
33
+
34
+ Note: Run on a T4 GPU for faster inference.
35
+ ```
36
+ !pip install --upgrade --no-cache-dir gdown
37
+ !gdown 1oy8Y5zwkLel7diA63GNCD-6cfoBV4tq7
38
+ !unzip inference.zip
39
+ ```
40
+ ```python
41
+ %%capture
42
+ !pip install speechbrain
43
+ %cd inference
44
+ ```
45
 
46
  ```python
47
  import torchaudio
48
  from TTSModel import TTSModel
49
+ from IPython.display import Audio
50
  from speechbrain.inference.vocoders import HIFIGAN
51
 
52
  texts = ["This is a sample text for synthesis."]
53
 
54
+ model_source_path = "/content/inference"
55
  # Initialize TTS (Transformer) and Vocoder (HiFi-GAN)
56
+ my_tts_model = TTSModel.from_hparams(source=model_source_path)
57
  hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
58
 
59
  # Running the TTS
60
+ mel_output = my_tts_model.encode_text(texts)
61
 
62
  # Running Vocoder (spectrogram-to-waveform)
63
  waveforms = hifi_gan.decode_batch(mel_output)
64
 
65
  # Save the waveform
66
  torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 22050)
67
+ print("Saved the audio file!")
68
  ```
69
 
70
  If you want to generate multiple sentences in one shot, pass the sentences as items in a list.
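
When several sentences are synthesized at once, `waveforms.squeeze(1)` has shape `[batch, time]`, which `torchaudio.save` would write as one multi-channel file. The sketch below writes one file per sentence instead. It assumes `hifi_gan.decode_batch` returns a tensor of shape `[batch, 1, time]` and that `TTSModel.encode_text` accepts a list with several items; `save_each` is a hypothetical helper, not part of SpeechBrain or this repository.

```python
from typing import Callable, List, Sequence


def save_each(
    waveforms: Sequence,
    sample_rate: int,
    save_fn: Callable,
    prefix: str = "example_TTS",
) -> List[str]:
    """Write one file per batch item and return the paths that were written.

    `save_fn` is called as save_fn(path, waveform, sample_rate), which
    matches the positional signature of torchaudio.save.
    """
    paths = []
    for index, waveform in enumerate(waveforms):
        path = f"{prefix}_{index}.wav"
        save_fn(path, waveform, sample_rate)
        paths.append(path)
    return paths


# Usage with the objects from the example above (assumed shapes):
# texts = ["First sentence.", "Second sentence."]
# mel_output = my_tts_model.encode_text(texts)
# waveforms = hifi_gan.decode_batch(mel_output)   # [batch, 1, time]
# save_each(waveforms, 22050, torchaudio.save)    # each item is [1, time]
```

Iterating over the batch dimension yields one `[1, time]` tensor per sentence, which is the `[channels, time]` layout `torchaudio.save` expects. Shorter sentences in a padded batch may carry trailing silence.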
 
73
  ### Inference on GPU
74
  To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
75
 
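
As a minimal sketch of the GPU option, the snippet below builds the `run_opts` dictionary and falls back to the CPU when CUDA is not available. `pick_run_opts` is a hypothetical helper introduced here for illustration; only the `run_opts={"device": ...}` argument comes from the text above.

```python
def pick_run_opts(prefer_gpu: bool = True) -> dict:
    """Return a run_opts dict for from_hparams, falling back to CPU."""
    try:
        import torch  # imported lazily so the helper works without PyTorch

        has_cuda = torch.cuda.is_available()
    except ImportError:
        has_cuda = False
    device = "cuda" if (prefer_gpu and has_cuda) else "cpu"
    return {"device": device}


run_opts = pick_run_opts()
print(run_opts)

# Pass the result to the vocoder (and, assuming TTSModel follows the same
# SpeechBrain interface, to the TTS model as well):
# hifi_gan = HIFIGAN.from_hparams(
#     source="speechbrain/tts-hifigan-ljspeech",
#     savedir="tmpdir_vocoder",
#     run_opts=run_opts,
# )
```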
76
+ Note: To train the model, see the [TTS_Training_Inference](https://colab.research.google.com/drive/1VYu4kXdgpv7f742QGquA1G4ipD2Kg0kT?usp=sharing) notebook.
 
77
 
78
  ### Limitations
79
  The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.