Nông Văn Thắng committed
Commit 083db9c
Parent: ed62070
main
Files changed:
- .gitattributes +35 -0
- README.md +13 -88
.gitattributes
ADDED
@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
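Note on the hunk above: each added `.gitattributes` line routes files matching its pattern through the Git LFS filter. The following is a minimal Python sketch, not part of the commit, that approximates which paths those patterns would catch. The helper name `is_lfs_tracked` and the shortened pattern list are illustrative assumptions, and the matching is a simplification of Git's real pattern rules: slash-free patterns are matched against the file name only, and only the `dir/**/*` form of slash patterns is handled.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns added in this commit (not the full list of 35).
LFS_PATTERNS = [
    "*.bin",
    "*.safetensors",
    "*.tar.*",
    "*tfevents*",
    "saved_model/**/*",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: would `path` match one of the LFS patterns?

    Hypothetical helper for illustration; Git's own matching has more rules.
    """
    name = PurePosixPath(path).name
    for pattern in patterns:
        if pattern.endswith("/**/*"):
            # "dir/**/*" form: anything located below the named directory.
            root = pattern[: -len("/**/*")]
            if path.startswith(root + "/") and len(path) > len(root) + 1:
                return True
        elif "/" not in pattern and fnmatchcase(name, pattern):
            # Slash-free patterns are compared with the file name alone,
            # so they apply at any directory depth.
            return True
    return False


if __name__ == "__main__":
    for candidate in ["model.safetensors", "app.py", "saved_model/variables/variables.index"]:
        print(candidate, is_lfs_tracked(candidate))
```

For an authoritative answer inside a real checkout, `git check-attr filter -- <path>` reports the filter Git actually applies to a path.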
README.md
CHANGED
@@ -1,88 +1,13 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-- You can try the model here: <https://huggingface.co/spaces/thinhlpg/vixtts-demo>
-- For a quick demonstration, please refer to [this notebook](./viXTTS_Demo.ipynb) on Google Colab.
-Tutorial (Vietnamese): <https://youtu.be/pbwEbpOy0m8?feature=shared>
-![viXTTS Colab Demo](assets/vixtts_colab.png)
-
-## Local Usage
-
-This code is specifically designed for running on Ubuntu or WSL2. It is not intended for use on macOS or Windows systems.
-![viXTTS Gradio Demo](assets/vixtts_gradio_ui.png)
-
-### Hardware Recommendations
-
-- At least 10GB of free disk space
-- At least 16GB of RAM
-- **Nvidia GPU** with a minimum of 4GB of VRAM
-- By default, the model will utilize the GPU. In the absence of a GPU, it will run on the CPU and run much slower.
-
-### Required Software
-
-- Git
-- Python version >=3.9 and <= 3.11. The default version is set to 3.11, but you can modify the Python version in the `run.sh` file.
-
-### Usage
-
-```bash
-git clone https://github.com/thinhlpg/vixtts-demo
-cd vixtts-demo
-./run.sh
-```
-
-1. Run `run.sh` (dependencies will be automatically installed for the first run).
-2. Access the Gradio demo link.
-3. Load the model and wait for it to load.
-4. Inference and Enjoy 🤗
-5. The result will be saved in `output/`
-
-## Limitation
-
-- Subpar performance for input sentences under 10 words in Vietnamese language (yielding inconsistent output and odd trailing sounds).
-- This model is only fine-tuned in Vietnamese. The model's effectiveness with languages other than Vietnamese hasn't been tested, potentially reducing quality.
-
-## Contributions
-
-This project is not being actively maintained, and I do not plan to release the finetuning code due to sensitive reasons, as it might be used for unethical purposes. If you want to contribute by creating versions for other operating systems, such as Windows or macOS, please fork the repository, create a new branch, test thoroughly on the respective OS, and submit a pull request specifying your contributions.
-
-## Acknowledgements
-
-We would like to express our gratitude to all libraries, and resources that have played a role in the development of this demo, especially:
-
-- [Coqui TTS](https://github.com/coqui-ai/TTS) for XTTS foundation model and inference code
-- [Vinorm](https://github.com/v-nhandt21/Vinorm) and [Undethesea](https://github.com/undertheseanlp/underthesea) for Vietnamese text normalization
-- [Deepspeed](https://github.com/microsoft/DeepSpeed) for fast inference
-- [Huggingface Hub](https://huggingface.co/) for hosting the model
-- [Gradio](https://www.gradio.app/) for web UI
-- [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet) for noise removal
-
-## Citation
-
-```bibtex
-@misc{viVoice,
-author = {Thinh Le Phuoc Gia, Tuan Pham Minh, Hung Nguyen Quoc, Trung Nguyen Quoc, Vinh Truong Hoang},
-title = {viVoice: Enabling Vietnamese Multi-Speaker Speech Synthesis},
-url = {https://github.com/thinhlpg/viVoice},
-year = {2024}
-}
-```
-
-A manuscript and a friendly dev log documenting the process might be made available later (including other works that were experimented with, but details about the filtering process are not specified in this README file).
-
-## Contact 💬
-
-- Facebook: <https://fb.com/thinhlpg/> (preferred; feel free to add friend and message me casually)
-- GitHub: <https://github.com/thinhlpg>
-- Email: <thinhlpg@gmail.com> (please don't; I prefer friendly, casual talk 💀)
+---
+title: Tts Vie
+emoji: 🚀
+colorFrom: indigo
+colorTo: indigo
+sdk: gradio
+sdk_version: 5.0.2
+app_file: app.py
+pinned: false
+short_description: text-to-speech
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
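
Note on the hunk above: the new README opens with a front-matter block (the lines between the two `---` markers) that holds the Space settings, such as the SDK, its version, and the entry file. The following is a minimal Python sketch, not part of the commit, that reads such a block using only the standard library. The function name `parse_front_matter` is a hypothetical helper, and the parser assumes flat `key: value` lines like the ones in this commit; it is not a general YAML parser, so every value is returned as a string.

```python
README_TEXT = """---
title: Tts Vie
emoji: 🚀
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 5.0.2
app_file: app.py
pinned: false
short_description: text-to-speech
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
"""


def parse_front_matter(text: str) -> dict:
    """Return the flat key/value pairs between the leading '---' markers.

    Hypothetical helper for illustration; values are kept as plain strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing marker reached
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # opening marker without a closing one


if __name__ == "__main__":
    meta = parse_front_matter(README_TEXT)
    print(meta["sdk"], meta["sdk_version"], meta["app_file"])
```

A full YAML parser would turn `pinned: false` into a boolean; this sketch leaves it as the string `"false"`, which is enough for inspecting the settings but not for validating them.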