speechbrainteam committed on
Commit
61d287c
1 Parent(s): aad77c0

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -17,7 +17,7 @@ metrics:
17
 
18
 
19
  # Text-to-Speech (TTS) with Zero-Shot Multi-Speaker Tacotron2 trained on LibriTTS
20
- ### Note: This is a work in progress
21
 
22
  This repository provides all the necessary tools for Zero-Shot Multi-Speaker Text-to-Speech (TTS) with SpeechBrain using a variation of [Tacotron2](https://arxiv.org/abs/1712.05884), extended to incorporate speaker identity information when generating speech. It is pretrained on [LibriTTS](https://www.openslr.org/60/).
23
 
@@ -36,6 +36,9 @@ Please notice that we encourage you to read our tutorials and learn more about
36
 
37
  The following is an example of converting text-to-speech with the speaker voice characteristics extracted from reference speech.
38
 
 
 
 
39
  ```
40
  import torchaudio
41
  from speechbrain.pretrained import MSTacotron2
 
17
 
18
 
19
  # Text-to-Speech (TTS) with Zero-Shot Multi-Speaker Tacotron2 trained on LibriTTS
20
+ ### Note: This project is a work in progress. The model is operational, and we are now focusing on improving the quality of the generated voice.
21
 
22
  This repository provides all the necessary tools for Zero-Shot Multi-Speaker Text-to-Speech (TTS) with SpeechBrain using a variation of [Tacotron2](https://arxiv.org/abs/1712.05884), extended to incorporate speaker identity information when generating speech. It is pretrained on [LibriTTS](https://www.openslr.org/60/).
23
 
 
36
 
37
  The following is an example of converting text-to-speech with the speaker voice characteristics extracted from reference speech.
38
 
39
+ **Note:**
40
+ - The model generates speech at a sampling rate of 22050 Hz, but the reference signal used to capture the speaker identity must be sampled at 16 kHz.
41
+
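The sample-rate requirement in the note above can be checked before a reference recording is handed to the model. The following is a minimal sketch using only the Python standard library, not part of the SpeechBrain API: the constants and the helpers `wav_sample_rate`, `is_valid_reference`, and `write_silent_wav` are hypothetical names introduced for illustration, and the `wave` module is assumed to be sufficient because the reference is a PCM WAV file.

```python
import os
import tempfile
import wave

# Rates stated in the note above; the constant names are illustrative.
REFERENCE_SAMPLE_RATE = 16_000  # required rate of the reference (speaker) audio
OUTPUT_SAMPLE_RATE = 22_050     # rate of the speech the model generates


def wav_sample_rate(path):
    """Return the sample rate stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_valid_reference(path):
    """Return True if the WAV file is already sampled at 16 kHz."""
    return wav_sample_rate(path) == REFERENCE_SAMPLE_RATE


def write_silent_wav(path, sample_rate, seconds=1):
    """Write mono 16-bit silence; used here only to demonstrate the check."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * sample_rate * seconds)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        for rate in (REFERENCE_SAMPLE_RATE, OUTPUT_SAMPLE_RATE):
            path = os.path.join(tmp_dir, f"reference_{rate}.wav")
            write_silent_wav(path, rate)
            status = "ok" if is_valid_reference(path) else "needs resampling to 16 kHz"
            print(f"{rate} Hz: {status}")
```

A file that fails the check would need to be resampled to 16 kHz first, for example with `torchaudio.functional.resample`; that step is not shown here.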
42
  ```
43
  import torchaudio
44
  from speechbrain.pretrained import MSTacotron2