Audio Classification Hub
Try out SpeechToolkit's Audio Classification models!
Applying machine learning to speech
Introducing SpeechToolkit, a unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, and more!
pip install "speechtoolkit[all]"
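To confirm the install worked before loading any models, a small check like the following may help. It is a sketch that relies only on the Python standard library; the helper name `is_installed` is ours for illustration and is not part of SpeechToolkit.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(package) is not None
```

Calling `is_installed("speechtoolkit")` should return `True` once the pip command above has completed in the active environment.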
With SpeechToolkit, text-to-speech is as easy as:
from speechtoolkit.tts import SingleSpeakerStyleTTS2Model
model = SingleSpeakerStyleTTS2Model()
model.infer_to_file('Hello, this is a test', 'out.wav')
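If you have several sentences to synthesize, the same call can be wrapped in a loop. The sketch below assumes only the `infer_to_file(text, path)` call shown above; the helper `synthesize_lines` and the `line_000.wav` naming scheme are hypothetical and not part of SpeechToolkit.

```python
from pathlib import Path
from typing import Iterable, List, Protocol


class SupportsInferToFile(Protocol):
    """Any object exposing infer_to_file(text, path), as in the example above."""

    def infer_to_file(self, text: str, path: str) -> object: ...


def synthesize_lines(
    model: SupportsInferToFile, lines: Iterable[str], out_dir: str
) -> List[Path]:
    """Write one audio file per non-empty line and return the paths written."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines instead of synthesizing nothing
        # Number files by how many have been written so far.
        path = target / f"line_{len(written):03d}.wav"
        model.infer_to_file(text, str(path))
        written.append(path)
    return written
```

For example, `synthesize_lines(SingleSpeakerStyleTTS2Model(), open('script.txt'), 'out')` would produce one WAV file per line of `script.txt`, assuming the model behaves as in the snippet above.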
And zero-shot voice conversion can be done in 3 lines of code:
from speechtoolkit.vc import LVC
vc = LVC(device='auto')
vc.infer_file('original.wav', 'sample.wav', 'out.wav')
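To convert a whole folder of recordings to one target voice, the call above can be repeated per file. This is a minimal sketch: it assumes the three positional arguments are the input recording, the reference sample of the target voice, and the output path, in the order used above. The helper `convert_folder` is hypothetical and not part of SpeechToolkit.

```python
from pathlib import Path
from typing import List


def convert_folder(vc, source_dir: str, reference_wav: str, out_dir: str) -> List[Path]:
    """Convert every .wav file in source_dir using a single reference sample."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    outputs: List[Path] = []
    # Sort so the processing order is stable across runs.
    for source in sorted(Path(source_dir).glob("*.wav")):
        destination = target / source.name
        # Assumed argument order, mirroring the example: input, reference, output.
        vc.infer_file(str(source), reference_wav, str(destination))
        outputs.append(destination)
    return outputs
```

With `vc = LVC(device='auto')`, calling `convert_folder(vc, 'recordings', 'sample.wav', 'converted')` would write one converted file per input, keeping the original file names.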
Our language and accent classifiers are just as easy to use:
# Identify the spoken language of a recording
from speechtoolkit.classification.languageclassification import WhisperLanguageClassifierModel
lc = WhisperLanguageClassifierModel()
print(lc.infer_file('audio.wav'))

# Classify the speaker's accent in the same recording
from speechtoolkit.classification.accentclassification import EdAccAccentClassifierModel
ac = EdAccAccentClassifierModel()
print(ac.infer_file('audio.wav'))
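Both classifiers expose the same `infer_file(path)` call, so one helper can run either of them over many files. The sketch below makes no assumption about what `infer_file` returns and simply stores the result as-is; `classify_files` is a hypothetical helper, not part of SpeechToolkit.

```python
from typing import Any, Dict, Iterable


def classify_files(classifier, paths: Iterable[str]) -> Dict[str, Any]:
    """Map each path to whatever classifier.infer_file returns for it."""
    results: Dict[str, Any] = {}
    for path in paths:
        # The return type of infer_file is not specified here, so keep it untouched.
        results[str(path)] = classifier.infer_file(str(path))
    return results
```

For example, `classify_files(WhisperLanguageClassifierModel(), ['a.wav', 'b.wav'])` would return a dictionary keyed by file name, and the same helper works unchanged with `EdAccAccentClassifierModel()`.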