ML for Speech

AI & ML interests

Applying machine learning to speech

Introducing SpeechToolkit, a unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, and more!

pip install "speechtoolkit[all]"

Examples

With SpeechToolkit, text-to-speech is as easy as:

from speechtoolkit.tts import SingleSpeakerStyleTTS2Model

model = SingleSpeakerStyleTTS2Model()

model.infer_to_file('Hello, this is a test', 'out.wav')
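To synthesize several sentences in one go, the same call can be wrapped in a loop. The sketch below is a hypothetical helper, not part of SpeechToolkit: `synthesize_lines`, the `tts_out` directory, and the `line_NNN.wav` naming are all assumptions made for illustration. It relies only on the model exposing `infer_to_file(text, path)` as shown above.

```python
from pathlib import Path


def synthesize_lines(model, lines, out_dir="tts_out"):
    """Write one WAV file per non-empty line and return the output paths.

    Hypothetical helper: `model` is any object with an
    `infer_to_file(text, path)` method, as in the example above.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for text in lines:
        text = text.strip()
        if not text:
            continue  # skip blank lines rather than passing empty text to the model
        # Number files by how many have been written so far, so there are no gaps.
        path = out_dir / f"line_{len(written) + 1:03d}.wav"
        model.infer_to_file(text, str(path))
        written.append(path)
    return written
```

With the library installed, `synthesize_lines(SingleSpeakerStyleTTS2Model(), ["Hello.", "This is a test."])` should produce `tts_out/line_001.wav` and `tts_out/line_002.wav`, loading the model once and reusing it for every line.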

And zero-shot voice conversion can be done in 3 lines of code:

from speechtoolkit.vc import LVC

vc = LVC(device='auto')

vc.infer_file('original.wav', 'sample.wav', 'out.wav')
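Converting a batch of recordings to the same target voice might look like the sketch below. `convert_all` is a hypothetical helper, not part of SpeechToolkit, and it assumes the argument order `infer_file(source, reference, output)`. That order is inferred from the file names in the example above and is not otherwise confirmed. The `vc_out` directory and the `_converted` suffix are also illustrative choices.

```python
from pathlib import Path


def convert_all(vc, sources, reference, out_dir="vc_out"):
    """Convert each source recording toward the voice in `reference`.

    Hypothetical helper: `vc` is any object with an
    `infer_file(source, reference, output)` method. The argument order is
    an assumption based on the file names in the example above.
    """
    reference = Path(reference)
    if not reference.is_file():
        raise FileNotFoundError(f"reference sample not found: {reference}")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for source in map(Path, sources):
        # Fail early with a clear message instead of deep inside the model.
        if not source.is_file():
            raise FileNotFoundError(f"source recording not found: {source}")
        target = out_dir / f"{source.stem}_converted.wav"
        vc.infer_file(str(source), str(reference), str(target))
        outputs.append(target)
    return outputs
```

For example, `convert_all(LVC(device='auto'), ['a.wav', 'b.wav'], 'sample.wav')` would write `vc_out/a_converted.wav` and `vc_out/b_converted.wav`, checking up front that every input file exists.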

Our language and accent classifiers are just as easy to use:

from speechtoolkit.classification.languageclassification import WhisperLanguageClassifierModel

lc = WhisperLanguageClassifierModel()

lc.infer_file('audio.wav')

from speechtoolkit.classification.accentclassification import EdAccAccentClassifierModel

ac = EdAccAccentClassifierModel()

ac.infer_file('audio.wav')
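Both classifiers share the same `infer_file(path)` call, so they can be run side by side over a list of files. The sketch below is a hypothetical helper, not part of SpeechToolkit. It stores whatever `infer_file` returns without interpreting it, because the return type is not described here.

```python
def classify_files(classifiers, paths):
    """Run every classifier on every file and collect the raw results.

    Hypothetical helper: `classifiers` maps a label such as "language" to any
    object with an `infer_file(path)` method. Results are kept as returned,
    since the return type of `infer_file` is not specified here.
    """
    results = {}
    for path in paths:
        results[str(path)] = {
            label: clf.infer_file(str(path))
            for label, clf in classifiers.items()
        }
    return results
```

Using the objects created above, `classify_files({"language": lc, "accent": ac}, ["audio.wav"])` would return a dictionary keyed by file path, with one entry per classifier label.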