---
license: cc-by-nc-4.0
tags:
- fassy
pipeline_tag: text-to-speech
---

# Shona Text-to-Speech

This repository contains the text-to-speech (TTS) model checkpoint for the **Shona (sna)** language.

## Model Details

### Model Description

- **Developed by:** Fastino Mateteva
- **Model type:** Text to Speech
- **Language(s) (NLP):** Shona
- **Finetuned from model:** SpeechT5

## Usage

Install the required libraries (PyTorch is also needed to run the model):

```
pip install --upgrade transformers accelerate
```

Then run inference with the following code snippet:

```python
# Load the tokenizer and model directly from the Hub
import torch
from transformers import AutoTokenizer, AutoModelForTextToWaveform

tokenizer = AutoTokenizer.from_pretrained("Fastino06/ff")
model = AutoModelForTextToWaveform.from_pretrained("Fastino06/ff")

text = "some example text in the Shona language"
inputs = tokenizer(text, return_tensors="pt")

# Disable gradient tracking for inference
with torch.no_grad():
    output = model(**inputs).waveform
```

The resulting waveform can be saved as a `.wav` file using SciPy:

```python
import scipy.io.wavfile

# Drop the batch dimension and convert the tensor to a NumPy array before writing
scipy.io.wavfile.write(
    "fassy.wav",
    rate=model.config.sampling_rate,
    data=output.squeeze().cpu().numpy(),
)
```

Or played back in a Jupyter Notebook / Google Colab:

```python
from IPython.display import Audio

Audio(output.numpy(), rate=model.config.sampling_rate)
```

## BibTeX citation

This model was developed by Fastino Mateteva.
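
## Optional: 16-bit PCM export

The Usage section writes the waveform as floating-point samples. Some audio tools expect 16-bit PCM instead, so the following is a minimal sketch of that conversion using only the Python standard library. The helper name `write_pcm16_wav` is hypothetical, and the sketch assumes the samples are mono floats in the range [-1.0, 1.0]; values outside that range are clipped.

```python
import array
import sys
import wave


def write_pcm16_wav(path, samples, sampling_rate):
    """Write mono float samples (assumed in [-1.0, 1.0]) as 16-bit PCM WAV.

    Hypothetical helper for illustration; returns the number of frames written.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # clip out-of-range values
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)
```

With the `output` tensor from the Usage section, a call might look like `write_pcm16_wav("fassy.wav", output.squeeze().tolist(), model.config.sampling_rate)`.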