kevinwang676 committed
Commit 38d07ec • 1 Parent(s): fb894d9
Update README.md
README.md CHANGED
@@ -1,90 +1,13 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-2. CUDA
-3. [Pytorch](https://pytorch.org/get-started/previous-versions/#v1131) version 1.13.1 (+cu117)
-4. Clone this repository
-5. Install python requirements.
-```
-pip install -r requirements.txt
-```
-
-~~1. You may need to install espeak first: `apt-get install espeak`~~
-
-If you want to proceed with those cleaned texts in [filelists](filelists), you need to install espeak.
-```
-apt-get install espeak
-```
-7. Prepare datasets & configuration
-
-~~1. ex) Download and extract the LJ Speech dataset, then rename or create a link to the dataset folder: `ln -s /path/to/LJSpeech-1.1/wavs DUMMY1`~~
-1. wav files (22050Hz Mono, PCM-16)
-2. Prepare text files. One for training<sup>[(ex)](filelists/ljs_audio_text_train_filelist.txt)</sup> and one for validation<sup>[(ex)](filelists/ljs_audio_text_val_filelist.txt)</sup>.
-
-- Single speaker<sup>[(ex)](filelists/ljs_audio_text_test_filelist.txt)</sup>
-
-```
-wavfile_path|transcript
-```
-
-
-- Multi speaker<sup>[(ex)](filelists/vctk_audio_sid_text_test_filelist.txt)</sup>
-
-```
-wavfile_path|speaker_id|transcript
-```
-4. Run preprocessing with a [cleaner](text/cleaners.py) of your interest. You may change the [symbols](text/symbols.py) as well.
-- Single speaker
-```
-python preprocess.py --text_index 1 --filelists PATH_TO_train.txt --text_cleaners CLEANER_NAME
-python preprocess.py --text_index 1 --filelists PATH_TO_val.txt --text_cleaners CLEANER_NAME
-```
-
-- Multi speaker
-```
-python preprocess.py --text_index 2 --filelists PATH_TO_train.txt --text_cleaners CLEANER_NAME
-python preprocess.py --text_index 2 --filelists PATH_TO_val.txt --text_cleaners CLEANER_NAME
-```
-The resulting cleaned text would be like [this(single)](filelists/ljs_audio_text_test_filelist.txt.cleaned). <sup>[ex - multi](filelists/vctk_audio_sid_text_test_filelist.txt.cleaned)</sup>
-
-9. Build Monotonic Alignment Search.
-```sh
-# Cython-version Monotonoic Alignment Search
-cd monotonic_align
-mkdir monotonic_align
-python setup.py build_ext --inplace
-```
-8. Edit [configurations](configs) based on files and cleaners you used.
-
-## Setting json file in [configs](configs)
-| Model | How to set up json file in [configs](configs) | Sample of json file configuration|
-| :---: | :---: | :---: |
-| iSTFT-VITS2 | ```"istft_vits": true, ```<br>``` "upsample_rates": [8,8], ``` | istft_vits2_base.json |
-| MB-iSTFT-VITS2 | ```"subbands": 4,```<br>```"mb_istft_vits": true, ```<br>``` "upsample_rates": [4,4], ``` | mb_istft_vits2_base.json |
-| MS-iSTFT-VITS2 | ```"subbands": 4,```<br>```"ms_istft_vits": true, ```<br>``` "upsample_rates": [4,4], ``` | ms_istft_vits2_base.json |
-| Mini-iSTFT-VITS2 | ```"istft_vits": true, ```<br>``` "upsample_rates": [8,8], ```<br>```"hidden_channels": 96, ```<br>```"n_layers": 3,``` | mini_istft_vits2_base.json |
-| Mini-MB-iSTFT-VITS2 | ```"subbands": 4,```<br>```"mb_istft_vits": true, ```<br>``` "upsample_rates": [4,4], ```<br>```"hidden_channels": 96, ```<br>```"n_layers": 3,```<br>```"upsample_initial_channel": 256,``` | mini_mb_istft_vits2_base.json |
-
-## Training Example
-```sh
-# train_ms.py for multi speaker
-python train.py -c configs/mb_istft_vits2_base.json -m models/test
-```
-
-## Credits
-- [jaywalnut310/vits](https://github.com/jaywalnut310/vits)
-- [p0p4k/vits2_pytorch](https://github.com/p0p4k/vits2_pytorch)
-- [MasayaKawamura/MB-iSTFT-VITS](https://github.com/MasayaKawamura/MB-iSTFT-VITS)
-- [ORI-Muchim/PolyLangVITS](https://github.com/ORI-Muchim/PolyLangVITS)
-- [tonnetonne814/MB-iSTFT-VITS-44100-Ja](https://github.com/tonnetonne814/MB-iSTFT-VITS-44100-Ja)
-- [misakiudon/MB-iSTFT-VITS-multilingual](https://github.com/misakiudon/MB-iSTFT-VITS-multilingual)
+---
+title: VITS2 Chinese
+emoji: π
+colorFrom: blue
+colorTo: pink
+sdk: gradio
+sdk_version: 3.36.1
+app_file: app.py
+pinned: false
+license: mit
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
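
For reference, the filelist layout described in the removed README (`wavfile_path|transcript` for single speaker, `wavfile_path|speaker_id|transcript` for multi speaker) can be read with a few lines of Python. This is a minimal sketch and not code from the repository: `FilelistEntry` and `parse_filelist_line` are hypothetical names, and treating the transcript as the last field is an assumption inferred from the `--text_index 1` and `--text_index 2` arguments in the removed preprocessing commands.

```python
from typing import NamedTuple, Optional


class FilelistEntry(NamedTuple):
    """One row of a pipe-delimited filelist (hypothetical helper type)."""
    wav_path: str
    speaker_id: Optional[str]  # None for single-speaker filelists
    transcript: str


def parse_filelist_line(line: str, multi_speaker: bool = False) -> FilelistEntry:
    """Split one filelist line into its fields.

    Assumes the transcript is the last field, so any further '|'
    characters are kept as part of the transcript.
    """
    expected = 3 if multi_speaker else 2
    # maxsplit = expected - 1 yields at most `expected` parts.
    parts = line.rstrip("\n").split("|", expected - 1)
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}: {line!r}")
    if multi_speaker:
        wav_path, speaker_id, transcript = parts
        return FilelistEntry(wav_path, speaker_id, transcript)
    wav_path, transcript = parts
    return FilelistEntry(wav_path, None, transcript)
```

The speaker ID is left as a string here; whether it should be converted to an integer depends on how the training script consumes it, which this diff does not show.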