speechbrain/hifigan-hubert-l1-3-7-12-18-23-k1000-LibriTTS

speech-synthesis

chaanks committed on Jul 17, 2024

Commit 2e0d987 · verified · 1 Parent(s): ada410f

Update README.md

Files changed (1)

README.md +2 -2

@@ -19,8 +19,8 @@ datasets:
 This repository provides all the necessary tools for using a [scalable HiFiGAN Unit](https://arxiv.org/abs/2406.10735) vocoder trained with [LibriTTS](https://www.openslr.org/141/).
-The pre-trained model take as input discrete self-supervised representations and produces a waveform as output. Typically, this model is utilized on top of a speech-to-unit translation model that converts an input utterance from a source language into a sequence of discrete speech units in a target language.
-To generate the discrete self-supervised representations, we employ a K-means clustering model trained on HuBERT hidden layers, with `k=1000`.
+The pre-trained model take as input discrete self-supervised representations and produces a waveform as output. This is suitable for a wide range of generative tasks such as speech enhancement, separation, text-to-speech, voice cloning, etc. Please read [DASB - Discrete Audio and Speech Benchmark](https://arxiv.org/abs/2406.14294) for more information.
+To generate the discrete self-supervised representations, we employ a K-means clustering model trained using `facebook/hubert-large-ll60k` hidden layers, with k=1000.
 ## Install SpeechBrain
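
The changed lines describe discrete units obtained by K-means clustering of HuBERT hidden-layer features. As a rough illustration of that quantization step only, here is a minimal pure-Python sketch that maps each feature frame to the index of its nearest centroid. It is not the SpeechBrain implementation: the function name `quantize`, the toy dimension, the toy codebook size, and the random data are all assumptions made for the example (the README uses k=1000 and real HuBERT features).

```python
import random


def quantize(frames, centroids):
    """Map each feature frame to the index of its nearest centroid.

    Hypothetical helper for illustration; distance is squared Euclidean.
    """
    units = []
    for frame in frames:
        best_index = min(
            range(len(centroids)),
            key=lambda i: sum((x - c) ** 2 for x, c in zip(frame, centroids[i])),
        )
        units.append(best_index)
    return units


rng = random.Random(0)
DIM = 4  # toy feature dimension (assumption; real HuBERT features are much larger)
K = 8    # toy codebook size (assumption; the README states k=1000)

# Stand-ins for a trained K-means codebook and for per-frame hidden-layer features.
centroids = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(K)]
frames = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(5)]

# One integer unit per frame; a unit vocoder consumes sequences like this.
units = quantize(frames, centroids)
```

In this sketch `units` is a list with one integer in `range(K)` per input frame, which is the kind of sequence a unit-based vocoder takes as input.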