Mongolian VITS — My-Voice Fine-Tune
Speaker-adapted fine-tune of Bokhbat/mongolian-vits-tts:
the multi-speaker Mongolian VITS model with one new voice (speaker01) added,
without degrading the original Mongolian ability.
- Base: multi-speaker VITS, 78 Mongolian speakers
- This model: 79 speakers = original 78 (ids 0–77, unchanged) + `speaker01` (id 78, the new voice)
- Adaptation data: ~3.7 min (57 clips), single speaker
- Best checkpoint: epoch 93 / step 609 (eval-loss best, early-stopped at plateau)
- Sample rate: 22050 Hz
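
The id layout above (78 original ids kept, one new id appended) can be illustrated with a minimal sketch. The helper `append_speaker` and the `spk_0000` .. `spk_0077` naming of the base map are assumptions made for illustration; the actual map ships in `speakers.pth` and may name speakers differently.

```python
from typing import Dict


def append_speaker(name_to_id: Dict[str, int], new_name: str) -> Dict[str, int]:
    """Return a copy of the map with new_name assigned the next free id."""
    if new_name in name_to_id:
        raise ValueError(f"{new_name!r} is already in the speaker map")
    # Ids must be contiguous from 0 so that the next id equals the map size.
    if sorted(name_to_id.values()) != list(range(len(name_to_id))):
        raise ValueError("existing ids are not contiguous from 0")
    extended = dict(name_to_id)  # copy, so existing entries are never touched
    extended[new_name] = len(name_to_id)
    return extended


# Hypothetical base map: 78 speakers, ids 0..77 (naming scheme assumed).
base_map = {f"spk_{i:04d}": i for i in range(78)}
new_map = append_speaker(base_map, "speaker01")
print(len(new_map), new_map["speaker01"])  # 79 78
```

Because the new name always takes the next free id, every existing name keeps its id, which is what lets the original voices keep their embedding rows.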
How Mongolian ability was protected (Strategy A)
- Original 78 speaker ids preserved; new voice appended as id 78 (so the speaker embedding table was expanded, not overwritten).
- `text_encoder` (phonetics/text) and `duration_predictor` (rhythm/prosody) were frozen: the language model cannot drift on the small dataset.
- Low LR `2e-5` (base used `2e-4`) + eval-based best-model selection.
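
The three measures above can be sketched in PyTorch on a toy module. This is not the training code used for this model: `ToyVits`, `speaker_emb`, `expand_speaker_table`, the mean initialisation of the new row, and the choice of AdamW are all assumptions for illustration. Only the module names `text_encoder` and `duration_predictor`, the 78 to 79 expansion, and the `2e-5` learning rate come from the description above.

```python
import torch
from torch import nn


class ToyVits(nn.Module):
    """Tiny stand-in with the same roles; not the real VITS architecture."""

    def __init__(self, num_speakers: int, dim: int = 8):
        super().__init__()
        self.text_encoder = nn.Linear(dim, dim)
        self.duration_predictor = nn.Linear(dim, 1)
        self.decoder = nn.Linear(dim, dim)
        self.speaker_emb = nn.Embedding(num_speakers, dim)  # hypothetical name


def expand_speaker_table(old: nn.Embedding, extra: int = 1) -> nn.Embedding:
    """Return a larger table whose first rows are copied from `old`."""
    n = old.num_embeddings
    new = nn.Embedding(n + extra, old.embedding_dim)
    with torch.no_grad():
        new.weight[:n].copy_(old.weight)  # keep the original voices as they were
        # Assumed initialisation: start new rows at the mean of existing voices.
        new.weight[n:].copy_(old.weight.mean(dim=0, keepdim=True))
    return new


torch.manual_seed(0)
model = ToyVits(num_speakers=78)
old_rows = model.speaker_emb.weight.detach().clone()

# 1) Expand the speaker table from 78 to 79 rows instead of overwriting a row.
model.speaker_emb = expand_speaker_table(model.speaker_emb)

# 2) Freeze the text and duration modules.
for module in (model.text_encoder, model.duration_predictor):
    module.requires_grad_(False)

# 3) Optimise only the remaining parameters at the low learning rate.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=2e-5)  # optimizer choice is assumed
```

Passing only `trainable` to the optimizer means the frozen modules receive no updates at all, including any weight decay the optimizer would otherwise apply.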
The original 78 voices still synthesize full natural Mongolian; speaker01
is the newly learned voice. Note: 3.7 min is very little data — speaker01
is recognizable but rough; more data would sharpen it.
Files
| File | Description |
|---|---|
| `best_model.pth` | Fine-tuned VITS checkpoint (79 speakers) |
| `config.json` | Coqui TTS config |
| `speakers.pth` | 79-speaker name→id map (`speaker01` = 78) |
| `tensorboard/` | Fine-tune training curves |
| `ft_yourvoice_spk01.wav` | Sample: new voice (`speaker01`) |
| `ft_original_spk0053.wav` | Sample: an original voice (`spk_0053`), Mongolian-ability check |
Usage
```python
from huggingface_hub import hf_hub_download
from TTS.utils.synthesizer import Synthesizer

repo = "Bokhbat/mongolian-vits-myvoice"
ckpt = hf_hub_download(repo, "best_model.pth")
cfg = hf_hub_download(repo, "config.json")
spk = hf_hub_download(repo, "speakers.pth")

syn = Synthesizer(ckpt, cfg, tts_speakers_file=spk, use_cuda=False)

# the new voice:
wav = syn.tts("Сайн байна уу?", speaker_name="speaker01")
syn.save_wav(wav, "myvoice.wav")

# an original Mongolian voice still works:
wav = syn.tts("Сайн байна уу?", speaker_name="spk_0053")
syn.save_wav(wav, "original.wav")
```
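
To compare the new voice against several original voices on the same sentence, the two calls shown above can be wrapped in a small loop. This is a sketch: `render_speakers` and the `samples` output directory are hypothetical, and the helper relies only on `syn.tts(text, speaker_name=...)` and `syn.save_wav(wav, path)` as used in the snippet above.

```python
from pathlib import Path
from typing import Iterable, List


def render_speakers(syn, text: str, speakers: Iterable[str],
                    out_dir: str = "samples") -> List[Path]:
    """Synthesize `text` once per speaker and save one WAV file per speaker."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in speakers:
        wav = syn.tts(text, speaker_name=name)
        path = out / f"{name}.wav"  # file naming is an assumption
        syn.save_wav(wav, str(path))
        paths.append(path)
    return paths
```

With the `syn` object from the usage snippet, `render_speakers(syn, "Сайн байна уу?", ["speaker01", "spk_0053"])` would write one file per speaker for side-by-side listening.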