Paper: Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis (arXiv:2411.01156)
By accessing this model, you agree not to use it to generate content that violates the DMCA or local laws.
Fish Speech V1.5 is a leading text-to-speech (TTS) model trained on more than 1 million hours of audio data in multiple languages.
Supported languages:
Please refer to the Fish Speech GitHub repository for more information.
A demo is available at Fish Audio.
If you find this repository useful, please consider citing this work:
@misc{fish-speech-v1.4,
title={Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis},
author={Shijia Liao and Yuxuan Wang and Tianyu Li and Yifan Cheng and Ruoyi Zhang and Rongzhi Zhou and Yijin Xing},
year={2024},
eprint={2411.01156},
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2411.01156},
}
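This model is licensed under CC-BY-NC-SA-4.0, which requires attribution, limits use to non-commercial purposes, and requires derivative works to be shared under the same license.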