---
license: cc-by-nc-sa-3.0
---

# KhanomTan TTS v1.0

KhanomTan TTS (ขนมตาล) is an open-source Thai text-to-speech model that supports speech in multiple languages, including Thai and English.

KhanomTan TTS is a YourTTS model trained on multilingual data that includes Thai. We train it on the Thai speech corpora TSync 1* and TSync 2*, together with [mbarnig/lb-de-fr-en-pt-12800-TTS-CORPUS](https://huggingface.co/datasets/mbarnig/lb-de-fr-en-pt-12800-TTS-CORPUS), using code from [🐸 Coqui-TTS](https://github.com/coqui-ai/TTS).

### Config
We add Thai characters to the graphemes config for training, and use the Speaker Encoder model from [🐸 Coqui-TTS](https://github.com/coqui-ai/TTS/releases/tag/speaker_encoder_model).
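
As a rough illustration of what extending a grapheme set with Thai characters can look like, here is a minimal sketch. It assumes the set is built from the Thai Unicode block (U+0E00 to U+0E7F); the exact character list used for KhanomTan TTS is not stated here, and `thai_graphemes` and the `characters_config` dictionary (including its key names and the Latin and punctuation strings) are hypothetical, not the actual Coqui-TTS config.

```python
import unicodedata


def thai_graphemes() -> str:
    """Collect every assigned code point in the Thai Unicode block."""
    chars = []
    for code_point in range(0x0E00, 0x0E80):
        ch = chr(code_point)
        # Unassigned code points have the general category "Cn"; skip them.
        if unicodedata.category(ch) != "Cn":
            chars.append(ch)
    return "".join(chars)


# Hypothetical config fragment: key names and the non-Thai strings are
# illustrative assumptions, not the values used to train this model.
characters_config = {
    "pad": "_",
    "characters": "abcdefghijklmnopqrstuvwxyz" + thai_graphemes(),
    "punctuations": "!'(),-.:;? ",
}
```

Building the string from the Unicode block avoids typing the characters by hand, but a real config may deliberately include or exclude specific symbols.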

### Dataset
We use the TSync 1 and TSync 2 corpora, which are not complete datasets, and add them to the [mbarnig/lb-de-fr-en-pt-12800-TTS-CORPUS](https://huggingface.co/datasets/mbarnig/lb-de-fr-en-pt-12800-TTS-CORPUS) dataset.
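
The combination described above can be pictured as one list of per-language dataset entries. The sketch below is only an assumption about how such a list might be organized: `DatasetEntry`, `build_dataset_list`, and all directory paths are hypothetical, and the language codes are taken from the name of the multilingual corpus.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetEntry:
    name: str      # hypothetical label for the corpus
    path: str      # hypothetical local directory
    language: str  # language code attached to every utterance in the corpus


def build_dataset_list(
    thai_root: str, multilingual_root: str, other_languages: List[str]
) -> List[DatasetEntry]:
    """Put the two Thai corpora and the multilingual corpus into one list."""
    entries = [
        DatasetEntry("tsync1", f"{thai_root}/tsync1", "th"),
        DatasetEntry("tsync2", f"{thai_root}/tsync2", "th"),
    ]
    for lang in other_languages:
        entries.append(
            DatasetEntry(f"mbarnig-{lang}", f"{multilingual_root}/{lang}", lang)
        )
    return entries


# Placeholder paths; adjust to wherever the corpora are stored locally.
datasets = build_dataset_list(
    "data/thai", "data/lb-de-fr-en-pt", ["lb", "de", "fr", "en", "pt"]
)
```

Tagging each entry with a language code is what lets a multilingual recipe tell the Thai utterances apart from the others.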

### Training the model
We train the model with the 🐸 Coqui-TTS multilingual VITS-model recipe (version 0.7.1, commit id d46fbc240ccf21797d42ac26cb27eb0b9f8d31c4) and the speaker encoder model from [🐸 Coqui-TTS](https://github.com/coqui-ai/TTS/releases/tag/speaker_encoder_model). We then release the best model for public access.

- Model cards: [https://github.com/wannaphong/KhanomTan-TTS-v1.0](https://github.com/wannaphong/KhanomTan-TTS-v1.0)
- Dataset (TSync 1 and TSync 2 only): [https://huggingface.co/datasets/wannaphong/tsync1-2-yourtts](https://huggingface.co/datasets/wannaphong/tsync1-2-yourtts)
- GitHub: [https://github.com/wannaphong/KhanomTan-TTS-v1.0](https://github.com/wannaphong/KhanomTan-TTS-v1.0)

*Note: These corpora are not complete. We could access only the publicly available portions.