Update README.md
README.md
CHANGED
@@ -1,11 +1,11 @@
 ### The MelGAN vocoder for StyleSpeech
 #### About StyleSpeech
 * StyleSpeech or Meta-StyleSpeech is a model for Multi-Speaker Adaptive Text-to-Speech Generation
-* The StyleSpeech model can be trained by official implementation
+* The StyleSpeech model can be trained by official implementation (https://github.com/KevinMIN95/StyleSpeech).
 #### About MelGAN vocoder
 * This MelGAN vocoder is used to transform the mel-spectrogram back to the waveform.
 * StyleSpeech is based on 16k Hz sampling rate, and there is no available 16k Hz multi-speaker vocoder.
-* Thus I train this vocoder from scratch using Libri-TTS train-100 hour dataset. The training pipeline is the same as the official MelGAN
+* Thus I train this vocoder from scratch using Libri-TTS train-100 hour dataset. The training pipeline is the same as the official MelGAN (https://github.com/descriptinc/melgan-neurips).
 * The synthesized sounds are close to the official demo with good quality.
 #### Training Details
 * GPU: RTX 2080Ti