Bangla TTS

This Bangla TTS model was trained on a single female speaker using the VITS architecture, described in the paper "Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech". We used the Coqui TTS toolkit (coqui-ai 🐸TTS) for both training and inference.
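
As a rough illustration of the inference side, the sketch below loads a trained checkpoint with the Coqui TTS `Synthesizer` class and writes the synthesized speech to a WAV file. The function name `synthesize_to_file` and the file names in the usage note are hypothetical and are not taken from this repository; substitute the checkpoint and config that ship with the model.

```python
from pathlib import Path


def synthesize_to_file(text, checkpoint_path, config_path, out_path, use_cuda=False):
    """Synthesize `text` with a trained Coqui TTS checkpoint and save it as a WAV file.

    Hypothetical helper for illustration; not part of the Coqui TTS API.
    """
    # Fail early with a clear message if either model file is missing.
    for path in (checkpoint_path, config_path):
        if not Path(path).is_file():
            raise FileNotFoundError(f"Missing model file: {path}")

    # Imported lazily so the path check above works without Coqui TTS installed.
    from TTS.utils.synthesizer import Synthesizer

    synthesizer = Synthesizer(
        tts_checkpoint=str(checkpoint_path),
        tts_config_path=str(config_path),
        use_cuda=use_cuda,
    )
    wav = synthesizer.tts(text)  # list of audio samples
    synthesizer.save_wav(wav, str(out_path))
    return out_path
```

A call might look like `synthesize_to_file("আমি বাংলায় গান গাই", "checkpoint.pth", "config.json", "out.wav")`, where both model file names are placeholders.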

Contributions

We collected Bangla speech data from several public sources on the internet, including the Mozilla Common Voice dataset, and used it to train the model.
We developed a Bangla VITS TTS (text-to-speech) system, which we trained and use to read arbitrary Bangla text with a state-of-the-art (SOTA) Bangla neural voice.

Dataset

The Bangla Text-to-Speech (TTS) team at IIT Madras has curated a Bangla speech corpus, which has been processed for research use. The audio has been downsampled to 22050 Hz, and the annotations have been converted from the original IITM style to the LJSpeech format. This processed dataset is distributed together with the weight files of the best-trained models. If you use the dataset in your research, please cite the corresponding paper (Paper Link). The dataset and model weights are intended as a resource for further Bangla TTS research. Dataset Link
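
The annotation conversion described above can be sketched roughly as follows. This is a minimal illustration, not the script used to build the dataset: it assumes the original annotations are Festival-style lines of the form `( utt_id " text " )`, and the helper names are hypothetical. Resampling itself is not shown; `has_target_rate` only checks that a WAV file is already at 22050 Hz.

```python
import re
import wave
from pathlib import Path

# Assumed source format (Festival-style, an assumption for this corpus):
#   ( utt_0001 " some Bangla text " )
_LINE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')

TARGET_RATE = 22050  # sample rate stated for the processed dataset


def to_ljspeech_rows(lines):
    """Convert annotation lines into LJSpeech 'id|text|normalized text' rows."""
    rows = []
    for line in lines:
        match = _LINE.match(line.strip())
        if match is None:
            continue  # skip blank or malformed lines
        utt_id, text = match.group(1), match.group(2).strip()
        # No separate text normalization is done here, so the text is repeated.
        rows.append(f"{utt_id}|{text}|{text}")
    return rows


def write_metadata(rows, out_path):
    """Write the rows to a UTF-8 metadata file, one utterance per line."""
    Path(out_path).write_text("\n".join(rows) + "\n", encoding="utf-8")


def has_target_rate(wav_path, rate=TARGET_RATE):
    """Return True if a PCM WAV file already has the expected sample rate."""
    with wave.open(str(wav_path), "rb") as wav_file:
        return wav_file.getframerate() == rate
```

In the LJSpeech layout, each row's ID normally corresponds to a file under `wavs/`, so `utt_0001` would pair with `wavs/utt_0001.wav`. Transcripts containing a `|` character would need escaping or cleaning before being written this way.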

