
Shared Model Cards

Prerequisites for use

  • This document serves as a quick lookup table for community training/finetuning results across various languages.
  • The models in this repository are open source and rely on voluntary contributions from the community.
  • Use of these models is conditional on respecting their respective creators. The convenience they offer comes from the creators' efforts.

Welcome to share here

  • Have a pretrained/finetuned result: a model checkpoint (ideally pruned to facilitate inference, i.e. keeping only ema_model_state_dict) and the corresponding vocab file (for tokenization).
  • Host a public Hugging Face model repository and upload the model-related files.
  • Open a pull request adding a model card to the current page, i.e. src\f5_tts\infer\SHARED.md.
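The pruning step mentioned in the first item can be sketched as follows. This is a minimal illustration, assuming the training checkpoint is a plain dictionary that holds the EMA weights under `ema_model_state_dict`. The helper names `prune_checkpoint` and `prune_checkpoint_file` are hypothetical and not part of the F5-TTS codebase, and the file-level helper additionally assumes PyTorch is installed.

```python
from typing import Any, Mapping

EMA_KEY = "ema_model_state_dict"  # key named in the sharing guideline


def prune_checkpoint(ckpt: Mapping[str, Any]) -> dict:
    """Return a new dict that keeps only the EMA weights (hypothetical helper)."""
    if EMA_KEY not in ckpt:
        raise KeyError(f"checkpoint has no '{EMA_KEY}' entry")
    return {EMA_KEY: ckpt[EMA_KEY]}


def prune_checkpoint_file(src_path: str, dst_path: str) -> None:
    """Load a .pt checkpoint, prune it, and save the result.

    Assumption: PyTorch is available and the file stores a plain dict.
    """
    import torch  # imported lazily so prune_checkpoint has no dependencies

    ckpt = torch.load(src_path, map_location="cpu")
    torch.save(prune_checkpoint(ckpt), dst_path)
```

Dropping everything except the EMA weights (for example optimizer state, if the checkpoint carries any) typically makes the uploaded file much smaller without affecting inference.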

Supported Languages

Multilingual

F5-TTS Base @ zh & en @ F5-TTS

| Model | 🤗Hugging Face | Data (Hours) | Model License |
| --- | --- | --- | --- |
| F5-TTS Base | ckpt & vocab | Emilia 95K zh&en | cc-by-nc-4.0 |
Model: hf://SWivid/F5-TTS/F5TTS_Base/model_1200000.safetensors
Vocab: hf://SWivid/F5-TTS/F5TTS_Base/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}

Other info, e.g. author info, GitHub repo, links to sampled results, usage instructions, tutorials (blog, video, etc.) ...
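The `Model:` and `Vocab:` entries in these cards use an `hf://` prefix. A minimal sketch of splitting such a reference into a repository id and a file path is shown below, assuming the layout is `hf://<owner>/<repo>/<path inside repo>`. The `parse_hf_uri` function and the `HFFile` type are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class HFFile(NamedTuple):
    repo_id: str   # "<owner>/<repo>"
    filename: str  # path of the file inside the repository


def parse_hf_uri(uri: str) -> HFFile:
    """Split an hf:// reference into repo id and filename (assumed layout)."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// reference: {uri!r}")
    # Split into owner, repo, and the remaining path (which may contain "/").
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected hf://<owner>/<repo>/<path>, got {uri!r}")
    owner, repo, filename = parts
    return HFFile(repo_id=f"{owner}/{repo}", filename=filename)


ref = parse_hf_uri("hf://SWivid/F5-TTS/F5TTS_Base/model_1200000.safetensors")
print(ref.repo_id)   # SWivid/F5-TTS
print(ref.filename)  # F5TTS_Base/model_1200000.safetensors
```

The two fields map onto the `repo_id` and `filename` arguments of `huggingface_hub.hf_hub_download`, which is one way to fetch the file locally if that package is installed.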

English

Finnish

F5-TTS Base @ fi @ AsmoKoskinen

| Model | 🤗Hugging Face | Data | Model License |
| --- | --- | --- | --- |
| F5-TTS Base | ckpt & vocab | Common Voice, Vox Populi | cc-by-nc-4.0 |
Model: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_common_voice_fi_vox_populi_fi_20241206.safetensors
Vocab: hf://AsmoKoskinen/F5-TTS_Finnish_Model/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}

French

F5-TTS Base @ fr @ RASPIAUDIO

| Model | 🤗Hugging Face | Data (Hours) | Model License |
| --- | --- | --- | --- |
| F5-TTS Base | ckpt & vocab | LibriVox | cc-by-nc-4.0 |
Model: hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/model_last_reduced.pt
Vocab: hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}

Hindi

F5-TTS Small @ hi @ SPRINGLab

| Model | 🤗Hugging Face | Data (Hours) | Model License |
| --- | --- | --- | --- |
| F5-TTS Small | ckpt & vocab | IndicTTS Hi & IndicVoices-R Hi | cc-by-4.0 |
Model: hf://SPRINGLab/F5-Hindi-24KHz/model_2500000.safetensors
Vocab: hf://SPRINGLab/F5-Hindi-24KHz/vocab.txt
Config: {"dim": 768, "depth": 18, "heads": 12, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}
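The `Config:` lines are JSON objects, so the Small configuration above can be compared with the Base configuration used by the other cards programmatically. The sketch below is illustrative only: `load_config` and `diff_configs` are hypothetical helpers, and the check that `dim` is divisible by `heads` is an assumption based on how multi-head attention is usually set up, not a rule stated in this document.

```python
import json

REQUIRED_KEYS = {"dim", "depth", "heads", "ff_mult", "text_dim", "conv_layers"}

BASE = '{"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}'
SMALL = '{"dim": 768, "depth": 18, "heads": 12, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}'


def load_config(text: str) -> dict:
    """Parse a Config line and run basic sanity checks (hypothetical helper)."""
    cfg = json.loads(text)
    missing = REQUIRED_KEYS - cfg.keys()
    if missing:
        raise ValueError(f"missing config keys: {sorted(missing)}")
    # Assumption: the model width is split evenly across attention heads.
    if cfg["dim"] % cfg["heads"] != 0:
        raise ValueError("dim must be divisible by heads")
    return cfg


def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    return {k: (a[k], b[k]) for k in a if a[k] != b.get(k)}


base, small = load_config(BASE), load_config(SMALL)
print(diff_configs(base, small))
# {'dim': (1024, 768), 'depth': (22, 18), 'heads': (16, 12)}
print(base["dim"] // base["heads"], small["dim"] // small["heads"])  # 64 64
```

Only `dim`, `depth`, and `heads` differ between the two variants, and both end up with the same per-head width.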

Italian

F5-TTS Base @ it @ alien79

| Model | 🤗Hugging Face | Data | Model License |
| --- | --- | --- | --- |
| F5-TTS Base | ckpt & vocab | ylacombe/cml-tts | cc-by-nc-4.0 |
Model: hf://alien79/F5-TTS-italian/model_159600.safetensors
Vocab: hf://alien79/F5-TTS-italian/vocab.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}

Japanese

F5-TTS Base @ ja @ Jmica

| Model | 🤗Hugging Face | Data (Hours) | Model License |
| --- | --- | --- | --- |
| F5-TTS Base | ckpt & vocab | Emilia 1.7k JA & Galgame Dataset 5.4k | cc-by-nc-4.0 |
Model: hf://Jmica/F5TTS/JA_8500000/model_8499660.pt
Vocab: hf://Jmica/F5TTS/JA_8500000/vocab_updated.txt
Config: {"dim": 1024, "depth": 22, "heads": 16, "ff_mult": 2, "text_dim": 512, "conv_layers": 4}

Mandarin

Spanish

F5-TTS Base @ es @ jpgallegoar

| Model | 🤗Hugging Face | Data (Hours) | Model License |
| --- | --- | --- | --- |
| F5-TTS Base | ckpt & vocab | Voxpopuli & Crowdsourced & TEDx, 218 hours | cc0-1.0 |
  • @jpgallegoar GitHub repo, Jupyter Notebook and Gradio usage for the Spanish model.