---
datasets:
- facebook/multilingual_librispeech
language:
- it
base_model:
- SWivid/F5-TTS
pipeline_tag: text-to-speech
license: cc-by-4.0
---

This is a test of how to fine-tune F5-TTS for Italian.

The model was trained on the 9-hour split of the facebook/multilingual_librispeech dataset for 200 epochs. Results so far:
- catastrophic forgetting (the model forgot English)
- lost the ability to clone voices properly
- Italian pronunciation is not yet good enough

The last produced checkpoint, and the one to test, is model_italian_200e_9h.safetensors.

The run.py file is an example of how to extract the WAV files and produce the metadata.csv used for training.
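
The extraction step can be sketched roughly as follows. This is not the contents of run.py, only a minimal standard-library illustration of the idea. The helper names `write_wav` and `export_dataset`, the `(samples, sample_rate, text)` record shape, the `wavs/` folder, and the `audio_file|text` header are all assumptions about what the training data preparation expects, so check them against the F5-TTS fine-tuning scripts before relying on them.

```python
import array
import wave
from pathlib import Path


def write_wav(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    # Clip each sample, then scale it to the signed 16-bit range.
    pcm = array.array(
        "h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples)
    )
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())


def export_dataset(records, out_dir):
    """Write one WAV per record plus a pipe-delimited metadata.csv.

    `records` is an iterable of (samples, sample_rate, text) tuples.
    This shape is hypothetical and chosen only for the example.
    Returns the number of exported records.
    """
    out_dir = Path(out_dir)
    (out_dir / "wavs").mkdir(parents=True, exist_ok=True)

    lines = ["audio_file|text"]  # assumed header, see the note above
    for i, (samples, sample_rate, text) in enumerate(records):
        rel_path = f"wavs/audio_{i:05d}.wav"
        write_wav(out_dir / rel_path, samples, sample_rate)
        # "|" is the column delimiter, so drop it from the transcript,
        # then collapse runs of whitespace into single spaces.
        clean_text = " ".join(text.replace("|", " ").split())
        lines.append(f"{rel_path}|{clean_text}")

    metadata = out_dir / "metadata.csv"
    metadata.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines) - 1
```

To use this with the real data, the records would need to be built from the dataset's decoded audio samples, sampling rate, and transcript for each utterance. The exact column names depend on the dataset version and are not assumed here.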