Shared Model Cards
Prerequisites for use
- This document serves as a quick lookup table for community training and fine-tuning results across various languages.
- The models in this repository are open source and rely on voluntary contributions.
- Use of these models is conditional on respecting their creators, whose efforts make this convenience possible.
You are welcome to share here
- Have a pretrained or fine-tuned result: a model checkpoint (ideally pruned to facilitate inference, i.e. keeping only `ema_model_state_dict`) and the corresponding vocab file (for tokenization).
- Host a public Hugging Face model repository and upload the model-related files.
- Make a pull request adding a model card to the current page, i.e. `src\f5_tts\infer\SHARED.md`.
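The pruning step mentioned in the first item can be sketched as follows. This is a minimal illustration, not the repository's own tooling: the helper names `prune_checkpoint` and `prune_checkpoint_file` are hypothetical, and it assumes the training checkpoint is a dictionary saved with PyTorch that contains an `ema_model_state_dict` entry alongside other training state.

```python
from typing import Any, Dict

EMA_KEY = "ema_model_state_dict"


def prune_checkpoint(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict that keeps only the EMA weights (hypothetical helper)."""
    if EMA_KEY not in checkpoint:
        raise KeyError(f"checkpoint has no {EMA_KEY!r} entry")
    # Everything else (optimizer state, step counters, ...) is dropped.
    return {EMA_KEY: checkpoint[EMA_KEY]}


def prune_checkpoint_file(src_path: str, dst_path: str) -> None:
    """Load a .pt checkpoint, prune it, and save the result (hypothetical helper)."""
    import torch  # imported lazily so the pure-dict helper works without PyTorch

    # Assumption: the file is a plain dict of tensors and simple values.
    checkpoint = torch.load(src_path, map_location="cpu")
    torch.save(prune_checkpoint(checkpoint), dst_path)


if __name__ == "__main__":
    # Toy stand-in for a real checkpoint; the non-EMA keys are illustrative only.
    toy = {EMA_KEY: {"weight": [0.1, 0.2]}, "optimizer_state_dict": {}, "step": 100}
    print(sorted(prune_checkpoint(toy)))
```

The pruned file is smaller to upload and download because the training-only state is left out.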
Supported Languages
Multilingual
F5-TTS Base @ zh & en @ F5-TTS
Model: hf://SWivid/F5-TTS/F5TTS_Base/model_1200000.safetensors
Vocab: hf://SWivid/F5-TTS/F5TTS_Base/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
Other information, e.g. author info, GitHub repo, links to sampled results, usage instructions, tutorials (blog, video, etc.) ...
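As a rough illustration of how the fields of a model card can be consumed, the sketch below splits an `hf://` path into a repository ID and a file path, and parses the `Config` line as JSON. The `hf://<owner>/<repo>/<path>` layout is an assumption inferred from the entries on this page, and `HfLocation` and `parse_hf_uri` are hypothetical names introduced for this example.

```python
import json
from typing import NamedTuple


class HfLocation(NamedTuple):
    repo_id: str   # "<owner>/<repo>"
    filename: str  # path of the file inside the repository


def parse_hf_uri(uri: str) -> HfLocation:
    """Split an hf:// path into repo ID and file path (hypothetical helper)."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// path: {uri!r}")
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected hf://<owner>/<repo>/<path>, got {uri!r}")
    owner, repo, filename = parts
    return HfLocation(f"{owner}/{repo}", filename)


# Fields copied from the F5-TTS Base card above.
card = {
    "Model": "hf://SWivid/F5-TTS/F5TTS_Base/model_1200000.safetensors",
    "Vocab": "hf://SWivid/F5-TTS/F5TTS_Base/vocab.txt",
    "Config": '{"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, '
              '"text_dim": 512, "conv_layers": 4}',
}

model = parse_hf_uri(card["Model"])
vocab = parse_hf_uri(card["Vocab"])
config = json.loads(card["Config"])

print(model.repo_id, model.filename)
print(vocab.filename, config["dim"], config["depth"])
```

The resulting `repo_id` and `filename` pair is the shape that download helpers such as `hf_hub_download` from `huggingface_hub` expect, should you prefer to fetch the files yourself.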
English
Finnish
F5-TTS Base @ fi @ AsmoKoskinen
Model: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_common_voice_fi_vox_populi_fi_20241206.safetensors
Vocab: hf://AsmoKoskinen/F5-TTS_Finnish_Model/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
French
F5-TTS Base @ fr @ RASPIAUDIO
Model: hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/model_last_reduced.pt
Vocab: hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
Hindi
F5-TTS Small @ hi @ SPRINGLab
Model: hf://SPRINGLab/F5-Hindi-24KHz/model_2500000.safetensors
Vocab: hf://SPRINGLab/F5-Hindi-24KHz/vocab.txt
Config: {"dim": 768, "depth": 18, "heads": 12, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
Italian
F5-TTS Base @ it @ alien79
Model: hf://alien79/F5-TTS-italian/model_159600.safetensors
Vocab: hf://alien79/F5-TTS-italian/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
Japanese
F5-TTS Base @ ja @ Jmica
Model: hf://Jmica/F5TTS/JA_8500000/model_8499660.pt
Vocab: hf://Jmica/F5TTS/JA_8500000/vocab_updated.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
Mandarin
Spanish
F5-TTS Base @ es @ jpgallegoar
| Model | 🤗Hugging Face | Data (Hours) | Model License |
| --- | --- | --- | --- |
| F5-TTS Base | ckpt & vocab | Voxpopuli & Crowdsourced & TEDx, 218 hours | cc0-1.0 |
- @jpgallegoar GitHub repo, with Jupyter Notebook and Gradio usage for the Spanish model.