Commit af91e0f by Winnie Chang (1 parent: f2300f2)

Update model

- README.md +581 -0
- dump/22k/xvector/dev/spk_xvector.ark +0 -0
- dump/22k/xvector/dev/spk_xvector.scp +141 -0
- dump/22k/xvector/test/spk_xvector.ark +0 -0
- dump/22k/xvector/test/spk_xvector.scp +214 -0
- dump/22k/xvector/train_no_dev/spk_xvector.ark +0 -0
- dump/22k/xvector/train_no_dev/spk_xvector.scp +174 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/600epoch.pth +3 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/config.yaml +500 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_backward_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_fake_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_forward_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_optim_step_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_real_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_train_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_adv_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_backward_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_dur_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_feat_match_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_forward_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_kl_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_mel_loss.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_optim_step_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_train_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/gpu_max_cached_mem_GB.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/iter_time.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/optim0_lr0.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/optim1_lr0.png +0 -0
- exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/train_time.png +0 -0
- meta.yaml +8 -0
README.md
ADDED
@@ -0,0 +1,581 @@
---
tags:
- espnet
- audio
- text-to-speech
language: zh
datasets:
- aishell3
license: cc-by-4.0
---

## ESPnet2 TTS model

### `winniech/espnet2_pretrained_aishell3_vits_tts_train_raw_phn_pypinyin_g2p_phone`

This model was trained by winniech using the aishell3 recipe in [espnet](https://github.com/espnet/espnet/).

### Demo: How to use in ESPnet2

Follow the [ESPnet installation instructions](https://espnet.github.io/espnet/installation.html)
if you haven't done that already.

```bash
cd espnet
git checkout b80a57dd08db255750fa5518eaa86925c5fb6b87
pip install -e .
cd egs2/aishell3/tts1
./run.sh --skip_data_prep false --skip_train true --download_model winniech/espnet2_pretrained_aishell3_vits_tts_train_raw_phn_pypinyin_g2p_phone
```

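The recipe script is the documented way to run this model. As a rough sketch only, the same checkpoint can probably also be driven from Python through ESPnet's `Text2Speech` inference class. Everything here is an assumption rather than part of this commit: that `espnet_model_zoo`, `kaldiio` and `soundfile` are installed, that the model tag resolves for download, that the script runs from a directory where the relative `.ark` paths in the `.scp` file exist, and the `synthesize` helper itself, which is hypothetical.

```python
# Hypothetical helper; not part of the model repository.
MODEL_TAG = (
    "winniech/espnet2_pretrained_aishell3_vits_tts_train_raw_phn_pypinyin_g2p_phone"
)


def synthesize(text, xvector_scp, speaker, out_wav="out.wav"):
    """Synthesize `text` in the voice of `speaker` and write a WAV file.

    `xvector_scp` is a Kaldi scp file such as
    dump/22k/xvector/dev/spk_xvector.scp, and `speaker` is one of its keys.
    """
    # Imported lazily so this file can be loaded without ESPnet installed.
    import kaldiio
    import soundfile as sf
    from espnet2.bin.tts_inference import Text2Speech

    tts = Text2Speech.from_pretrained(model_tag=MODEL_TAG)
    xvectors = kaldiio.load_scp(xvector_scp)  # lazy mapping: speaker ID -> vector
    # The model is trained with x-vector conditioning, so a speaker embedding
    # is passed alongside the text.
    result = tts(text, spembs=xvectors[speaker])
    sf.write(out_wav, result["wav"].cpu().numpy(), tts.fs)
    return out_wav
```

A call would look like `synthesize("你好", "dump/22k/xvector/dev/spk_xvector.scp", "SSB0009")`. The speaker embedding is required because the config sets `spk_embed_dim: 512`.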
## TTS config

<details><summary>expand</summary>

```
config: conf/train.yaml
print_config: false
log_level: INFO
dry_run: false
iterator_type: sequence
output_dir: exp/22k/tts_train_raw_phn_pypinyin_g2p_phone
ngpu: 1
seed: 777
num_workers: 4
num_att_plot: 3
dist_backend: nccl
dist_init_method: env://
dist_world_size: null
dist_rank: null
local_rank: 0
dist_master_addr: null
dist_master_port: null
dist_launcher: null
multiprocessing_distributed: false
unused_parameters: true
sharded_ddp: false
cudnn_enabled: true
cudnn_benchmark: false
cudnn_deterministic: false
collect_stats: false
write_collected_feats: false
max_epoch: 1000
patience: null
val_scheduler_criterion:
- valid
- loss
early_stopping_criterion:
- valid
- loss
- min
best_model_criterion:
-   - train
    - total_count
    - max
keep_nbest_models: 10
nbest_averaging_interval: 0
grad_clip: -1
grad_clip_type: 2.0
grad_noise: false
accum_grad: 1
no_forward_run: false
resume: true
train_dtype: float32
use_amp: false
log_interval: 50
use_matplotlib: true
use_tensorboard: true
create_graph_in_tensorboard: false
use_wandb: false
wandb_project: null
wandb_id: null
wandb_entity: null
wandb_name: null
wandb_model_log_interval: -1
detect_anomaly: false
pretrain_path: null
init_param: []
ignore_init_mismatch: false
freeze_param: []
num_iters_per_epoch: 1000
batch_size: 20
valid_batch_size: null
batch_bins: 1250000
valid_batch_bins: null
train_shape_file:
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/train/text_shape.phn
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/train/speech_shape
valid_shape_file:
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/valid/text_shape.phn
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/valid/speech_shape
batch_type: numel
valid_batch_type: null
fold_length:
- 150
- 204800
sort_in_batch: descending
sort_batch: descending
multiple_iterator: false
chunk_length: 500
chunk_shift_ratio: 0.5
num_cache_chunks: 1024
chunk_excluded_key_prefixes: []
train_data_path_and_name_and_type:
-   - dump/22k/raw/train_no_dev/text
    - text
    - text
-   - dump/22k/raw/train_no_dev/wav.scp
    - speech
    - sound
-   - dump/22k/xvector/train_no_dev/xvector.scp
    - spembs
    - kaldi_ark
valid_data_path_and_name_and_type:
-   - dump/22k/raw/dev/text
    - text
    - text
-   - dump/22k/raw/dev/wav.scp
    - speech
    - sound
-   - dump/22k/xvector/dev/xvector.scp
    - spembs
    - kaldi_ark
allow_variable_data_keys: false
max_cache_size: 0.0
max_cache_fd: 32
valid_max_cache_size: null
exclude_weight_decay: false
exclude_weight_decay_conf: {}
optim: adamw
optim_conf:
    lr: 0.0002
    betas:
    - 0.8
    - 0.99
    eps: 1.0e-09
    weight_decay: 0.0
scheduler: exponentiallr
scheduler_conf:
    gamma: 0.999875
optim2: adamw
optim2_conf:
    lr: 0.0002
    betas:
    - 0.8
    - 0.99
    eps: 1.0e-09
    weight_decay: 0.0
scheduler2: exponentiallr
scheduler2_conf:
    gamma: 0.999875
generator_first: false
token_list:
- <blank>
- <unk>
- d
- sh
- j
- i4
- zh
- l
- x
- e
- b
- g
- i1
- h
- q
- m
- t
- i2
- u4
- z
- ch
- i3
- f
- s
- n
- iou3
- r
- ian4
- ong1
- uei4
- e4
- en2
- ai4
- k
- ing2
- a1
- uo3
- u3
- ao4
- p
- an1
- eng2
- e2
- in1
- c
- ai2
- an4
- ian2
- u2
- ang4
- ian1
- ai3
- ing1
- ao3
- uo4
- ian3
- ing4
- ü4
- ang1
- u1
- iao4
- eng1
- iou4
- a4
- üan2
- ie4
- ou4
- er4
- en1
- ong2
- e1
- an3
- ei4
- uo2
- ou3
- ang2
- iang4
- ou1
- ang3
- an2
- eng4
- ong4
- uan4
- a3
- ia4
- ia1
- iao1
- iang1
- iou2
- uo1
- ei3
- iao3
- in4
- e3
- ü3
- iang3
- uei2
- en3
- uan1
- ie3
- ao1
- ai1
- üe4
- ü2
- ing3
- en4
- uei1
- er2
- uan3
- ü1
- in3
- en
- üe2
- ie2
- ei2
- ua4
- uan2
- in2
- a2
- ie1
- iang2
- ou2
- ong3
- uang3
- eng3
- uen1
- uai4
- ün4
- uang4
- uei3
- uen2
- uen4
- i
- iong4
- v3
- iao2
- üan4
- uang1
- ei1
- o2
- iou1
- uang2
- a
- ao2
- o1
- ua2
- uen3
- ua1
- v4
- üan3
- ün1
- üe1
- ün2
- o4
- er3
- iong3
- üan1
- ia3
- ia2
- iong1
- üe3
- ve4
- iong2
- uai2
- er
- ua3
- uai1
- ou
- ün3
- uai3
- ia
- uo
- o3
- v2
- ueng1
- o
- ei
- ua
- io1
- <sos/eos>
odim: null
model_conf: {}
use_preprocessor: true
token_type: phn
bpemodel: null
non_linguistic_symbols: null
cleaner: null
g2p: pypinyin_g2p_phone
feats_extract: linear_spectrogram
feats_extract_conf:
    n_fft: 1024
    hop_length: 256
    win_length: null
normalize: null
normalize_conf: {}
tts: vits
tts_conf:
    generator_type: vits_generator
    generator_params:
        hidden_channels: 192
        spks: -1
        spk_embed_dim: 512
        global_channels: 256
        segment_size: 32
        text_encoder_attention_heads: 2
        text_encoder_ffn_expand: 4
        text_encoder_blocks: 6
        text_encoder_positionwise_layer_type: conv1d
        text_encoder_positionwise_conv_kernel_size: 3
        text_encoder_positional_encoding_layer_type: rel_pos
        text_encoder_self_attention_layer_type: rel_selfattn
        text_encoder_activation_type: swish
        text_encoder_normalize_before: true
        text_encoder_dropout_rate: 0.1
        text_encoder_positional_dropout_rate: 0.0
        text_encoder_attention_dropout_rate: 0.1
        use_macaron_style_in_text_encoder: true
        use_conformer_conv_in_text_encoder: false
        text_encoder_conformer_kernel_size: -1
        decoder_kernel_size: 7
        decoder_channels: 512
        decoder_upsample_scales:
        - 8
        - 8
        - 2
        - 2
        decoder_upsample_kernel_sizes:
        - 16
        - 16
        - 4
        - 4
        decoder_resblock_kernel_sizes:
        - 3
        - 7
        - 11
        decoder_resblock_dilations:
        -   - 1
            - 3
            - 5
        -   - 1
            - 3
            - 5
        -   - 1
            - 3
            - 5
        use_weight_norm_in_decoder: true
        posterior_encoder_kernel_size: 5
        posterior_encoder_layers: 16
        posterior_encoder_stacks: 1
        posterior_encoder_base_dilation: 1
        posterior_encoder_dropout_rate: 0.0
        use_weight_norm_in_posterior_encoder: true
        flow_flows: 4
        flow_kernel_size: 5
        flow_base_dilation: 1
        flow_layers: 4
        flow_dropout_rate: 0.0
        use_weight_norm_in_flow: true
        use_only_mean_in_flow: true
        stochastic_duration_predictor_kernel_size: 3
        stochastic_duration_predictor_dropout_rate: 0.5
        stochastic_duration_predictor_flows: 4
        stochastic_duration_predictor_dds_conv_layers: 3
        vocabs: 180
        aux_channels: 513
    discriminator_type: hifigan_multi_scale_multi_period_discriminator
    discriminator_params:
        scales: 1
        scale_downsample_pooling: AvgPool1d
        scale_downsample_pooling_params:
            kernel_size: 4
            stride: 2
            padding: 2
        scale_discriminator_params:
            in_channels: 1
            out_channels: 1
            kernel_sizes:
            - 15
            - 41
            - 5
            - 3
            channels: 128
            max_downsample_channels: 1024
            max_groups: 16
            bias: true
            downsample_scales:
            - 2
            - 2
            - 4
            - 4
            - 1
            nonlinear_activation: LeakyReLU
            nonlinear_activation_params:
                negative_slope: 0.1
            use_weight_norm: true
            use_spectral_norm: false
        follow_official_norm: false
        periods:
        - 2
        - 3
        - 5
        - 7
        - 11
        period_discriminator_params:
            in_channels: 1
            out_channels: 1
            kernel_sizes:
            - 5
            - 3
            channels: 32
            downsample_scales:
            - 3
            - 3
            - 3
            - 3
            - 1
            max_downsample_channels: 1024
            bias: true
            nonlinear_activation: LeakyReLU
            nonlinear_activation_params:
                negative_slope: 0.1
            use_weight_norm: true
            use_spectral_norm: false
    generator_adv_loss_params:
        average_by_discriminators: false
        loss_type: mse
    discriminator_adv_loss_params:
        average_by_discriminators: false
        loss_type: mse
    feat_match_loss_params:
        average_by_discriminators: false
        average_by_layers: false
        include_final_outputs: true
    mel_loss_params:
        fs: 22050
        n_fft: 1024
        hop_length: 256
        win_length: null
        window: hann
        n_mels: 80
        fmin: 0
        fmax: null
        log_base: null
    lambda_adv: 1.0
    lambda_mel: 45.0
    lambda_feat_match: 2.0
    lambda_dur: 1.0
    lambda_kl: 1.0
    sampling_rate: 22050
    cache_generator_outputs: true
pitch_extract: null
pitch_extract_conf: {}
pitch_normalize: null
pitch_normalize_conf: {}
energy_extract: null
energy_extract_conf: {}
energy_normalize: null
energy_normalize_conf: {}
required:
- output_dir
- token_list
version: '202301'
distributed: false
```

</details>
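Several numbers in the config above are tied together, and a short script can check that they agree. The sketch below copies the values by hand. The variable names and the `lr_after` helper are illustrative, and it assumes the exponential scheduler decays the learning rate once per epoch. That assumption matters only for the last calculation, which estimates the rate at the epoch of the shipped `600epoch.pth` checkpoint.

```python
import math

# Values copied by hand from the TTS config.
n_fft = 1024
hop_length = 256
sampling_rate = 22050
decoder_upsample_scales = [8, 8, 2, 2]
aux_channels = 513
segment_size = 32  # frames per randomly cropped training segment
lr0, gamma = 0.0002, 0.999875

# The decoder has to expand one spectrogram frame into hop_length samples,
# so the product of the upsample scales should equal the hop length.
total_upsampling = math.prod(decoder_upsample_scales)

# A one-sided linear spectrogram has n_fft // 2 + 1 frequency bins,
# which is what the posterior encoder receives as aux_channels.
spectrogram_bins = n_fft // 2 + 1

# Length of one training segment in samples and in seconds.
segment_samples = segment_size * hop_length
segment_seconds = segment_samples / sampling_rate


def lr_after(epochs):
    """Learning rate after `epochs` decay steps (assumes one step per epoch)."""
    return lr0 * gamma ** epochs


print(total_upsampling == hop_length, spectrogram_bins == aux_channels)
print(segment_samples, round(segment_seconds, 3), lr_after(600))
```

Both checks hold for this config, and each training segment covers 8192 samples, a little over a third of a second at 22.05 kHz.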

### Citing ESPnet

```bibtex
@inproceedings{watanabe2018espnet,
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  title={{ESPnet}: End-to-End Speech Processing Toolkit},
  year={2018},
  booktitle={Proceedings of Interspeech},
  pages={2207--2211},
  doi={10.21437/Interspeech.2018-1456},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
}

@inproceedings{hayashi2020espnet,
  title={{Espnet-TTS}: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit},
  author={Hayashi, Tomoki and Yamamoto, Ryuichi and Inoue, Katsuki and Yoshimura, Takenori and Watanabe, Shinji and Toda, Tomoki and Takeda, Kazuya and Zhang, Yu and Tan, Xu},
  booktitle={Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={7654--7658},
  year={2020},
  organization={IEEE}
}
```

or arXiv:

```bibtex
@misc{watanabe2018espnet,
  title={ESPnet: End-to-End Speech Processing Toolkit},
  author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
  year={2018},
  eprint={1804.00015},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

dump/22k/xvector/dev/spk_xvector.ark
ADDED
Binary file (291 kB)

dump/22k/xvector/dev/spk_xvector.scp
ADDED
@@ -0,0 +1,141 @@
SSB0009 dump/22k/xvector/dev/spk_xvector.ark:8
SSB0011 dump/22k/xvector/dev/spk_xvector.ark:2074
SSB0012 dump/22k/xvector/dev/spk_xvector.ark:4140
SSB0016 dump/22k/xvector/dev/spk_xvector.ark:6206
SSB0018 dump/22k/xvector/dev/spk_xvector.ark:8272
SSB0033 dump/22k/xvector/dev/spk_xvector.ark:10338
SSB0038 dump/22k/xvector/dev/spk_xvector.ark:12404
SSB0043 dump/22k/xvector/dev/spk_xvector.ark:14470
SSB0057 dump/22k/xvector/dev/spk_xvector.ark:16536
SSB0073 dump/22k/xvector/dev/spk_xvector.ark:18602
SSB0080 dump/22k/xvector/dev/spk_xvector.ark:20668
SSB0112 dump/22k/xvector/dev/spk_xvector.ark:22734
SSB0122 dump/22k/xvector/dev/spk_xvector.ark:24800
SSB0133 dump/22k/xvector/dev/spk_xvector.ark:26866
SSB0139 dump/22k/xvector/dev/spk_xvector.ark:28932
SSB0145 dump/22k/xvector/dev/spk_xvector.ark:30998
SSB0149 dump/22k/xvector/dev/spk_xvector.ark:33064
SSB0193 dump/22k/xvector/dev/spk_xvector.ark:35130
SSB0197 dump/22k/xvector/dev/spk_xvector.ark:37196
SSB0200 dump/22k/xvector/dev/spk_xvector.ark:39262
SSB0241 dump/22k/xvector/dev/spk_xvector.ark:41328
SSB0246 dump/22k/xvector/dev/spk_xvector.ark:43394
SSB0261 dump/22k/xvector/dev/spk_xvector.ark:45460
SSB0267 dump/22k/xvector/dev/spk_xvector.ark:47526
SSB0273 dump/22k/xvector/dev/spk_xvector.ark:49592
SSB0287 dump/22k/xvector/dev/spk_xvector.ark:51658
SSB0288 dump/22k/xvector/dev/spk_xvector.ark:53724
SSB0299 dump/22k/xvector/dev/spk_xvector.ark:55790
SSB0307 dump/22k/xvector/dev/spk_xvector.ark:57856
SSB0309 dump/22k/xvector/dev/spk_xvector.ark:59922
SSB0315 dump/22k/xvector/dev/spk_xvector.ark:61988
SSB0316 dump/22k/xvector/dev/spk_xvector.ark:64054
SSB0323 dump/22k/xvector/dev/spk_xvector.ark:66120
SSB0338 dump/22k/xvector/dev/spk_xvector.ark:68186
SSB0339 dump/22k/xvector/dev/spk_xvector.ark:70252
SSB0341 dump/22k/xvector/dev/spk_xvector.ark:72318
SSB0342 dump/22k/xvector/dev/spk_xvector.ark:74384
SSB0354 dump/22k/xvector/dev/spk_xvector.ark:76450
SSB0366 dump/22k/xvector/dev/spk_xvector.ark:78516
SSB0375 dump/22k/xvector/dev/spk_xvector.ark:80582
SSB0379 dump/22k/xvector/dev/spk_xvector.ark:82648
SSB0380 dump/22k/xvector/dev/spk_xvector.ark:84714
SSB0382 dump/22k/xvector/dev/spk_xvector.ark:86780
SSB0385 dump/22k/xvector/dev/spk_xvector.ark:88846
SSB0393 dump/22k/xvector/dev/spk_xvector.ark:90912
SSB0394 dump/22k/xvector/dev/spk_xvector.ark:92978
SSB0395 dump/22k/xvector/dev/spk_xvector.ark:95044
SSB0407 dump/22k/xvector/dev/spk_xvector.ark:97110
SSB0415 dump/22k/xvector/dev/spk_xvector.ark:99176
SSB0426 dump/22k/xvector/dev/spk_xvector.ark:101242
SSB0427 dump/22k/xvector/dev/spk_xvector.ark:103308
SSB0434 dump/22k/xvector/dev/spk_xvector.ark:105374
SSB0435 dump/22k/xvector/dev/spk_xvector.ark:107440
SSB0470 dump/22k/xvector/dev/spk_xvector.ark:109506
SSB0482 dump/22k/xvector/dev/spk_xvector.ark:111572
SSB0502 dump/22k/xvector/dev/spk_xvector.ark:113638
SSB0534 dump/22k/xvector/dev/spk_xvector.ark:115704
SSB0535 dump/22k/xvector/dev/spk_xvector.ark:117770
SSB0539 dump/22k/xvector/dev/spk_xvector.ark:119836
SSB0544 dump/22k/xvector/dev/spk_xvector.ark:121902
SSB0565 dump/22k/xvector/dev/spk_xvector.ark:123968
SSB0570 dump/22k/xvector/dev/spk_xvector.ark:126034
SSB0578 dump/22k/xvector/dev/spk_xvector.ark:128100
SSB0588 dump/22k/xvector/dev/spk_xvector.ark:130166
SSB0590 dump/22k/xvector/dev/spk_xvector.ark:132232
SSB0594 dump/22k/xvector/dev/spk_xvector.ark:134298
SSB0599 dump/22k/xvector/dev/spk_xvector.ark:136364
SSB0601 dump/22k/xvector/dev/spk_xvector.ark:138430
SSB0603 dump/22k/xvector/dev/spk_xvector.ark:140496
SSB0606 dump/22k/xvector/dev/spk_xvector.ark:142562
SSB0607 dump/22k/xvector/dev/spk_xvector.ark:144628
SSB0609 dump/22k/xvector/dev/spk_xvector.ark:146694
SSB0614 dump/22k/xvector/dev/spk_xvector.ark:148760
SSB0623 dump/22k/xvector/dev/spk_xvector.ark:150826
SSB0629 dump/22k/xvector/dev/spk_xvector.ark:152892
SSB0631 dump/22k/xvector/dev/spk_xvector.ark:154958
SSB0632 dump/22k/xvector/dev/spk_xvector.ark:157024
SSB0666 dump/22k/xvector/dev/spk_xvector.ark:159090
SSB0668 dump/22k/xvector/dev/spk_xvector.ark:161156
SSB0671 dump/22k/xvector/dev/spk_xvector.ark:163222
SSB0686 dump/22k/xvector/dev/spk_xvector.ark:165288
SSB0700 dump/22k/xvector/dev/spk_xvector.ark:167354
SSB0710 dump/22k/xvector/dev/spk_xvector.ark:169420
SSB0720 dump/22k/xvector/dev/spk_xvector.ark:171486
SSB0723 dump/22k/xvector/dev/spk_xvector.ark:173552
SSB0737 dump/22k/xvector/dev/spk_xvector.ark:175618
SSB0746 dump/22k/xvector/dev/spk_xvector.ark:177684
SSB0748 dump/22k/xvector/dev/spk_xvector.ark:179750
SSB0751 dump/22k/xvector/dev/spk_xvector.ark:181816
SSB0758 dump/22k/xvector/dev/spk_xvector.ark:183882
SSB0760 dump/22k/xvector/dev/spk_xvector.ark:185948
SSB0762 dump/22k/xvector/dev/spk_xvector.ark:188014
SSB0778 dump/22k/xvector/dev/spk_xvector.ark:190080
SSB0780 dump/22k/xvector/dev/spk_xvector.ark:192146
SSB0784 dump/22k/xvector/dev/spk_xvector.ark:194212
SSB0786 dump/22k/xvector/dev/spk_xvector.ark:196278
SSB0794 dump/22k/xvector/dev/spk_xvector.ark:198344
SSB0913 dump/22k/xvector/dev/spk_xvector.ark:200410
SSB0919 dump/22k/xvector/dev/spk_xvector.ark:202476
SSB0935 dump/22k/xvector/dev/spk_xvector.ark:204542
SSB0966 dump/22k/xvector/dev/spk_xvector.ark:206608
SSB0987 dump/22k/xvector/dev/spk_xvector.ark:208674
SSB1020 dump/22k/xvector/dev/spk_xvector.ark:210740
SSB1024 dump/22k/xvector/dev/spk_xvector.ark:212806
SSB1056 dump/22k/xvector/dev/spk_xvector.ark:214872
SSB1072 dump/22k/xvector/dev/spk_xvector.ark:216938
SSB1096 dump/22k/xvector/dev/spk_xvector.ark:219004
SSB1100 dump/22k/xvector/dev/spk_xvector.ark:221070
SSB1115 dump/22k/xvector/dev/spk_xvector.ark:223136
SSB1125 dump/22k/xvector/dev/spk_xvector.ark:225202
SSB1131 dump/22k/xvector/dev/spk_xvector.ark:227268
SSB1136 dump/22k/xvector/dev/spk_xvector.ark:229334
SSB1161 dump/22k/xvector/dev/spk_xvector.ark:231400
SSB1203 dump/22k/xvector/dev/spk_xvector.ark:233466
SSB1218 dump/22k/xvector/dev/spk_xvector.ark:235532
SSB1253 dump/22k/xvector/dev/spk_xvector.ark:237598
SSB1341 dump/22k/xvector/dev/spk_xvector.ark:239664
SSB1366 dump/22k/xvector/dev/spk_xvector.ark:241730
SSB1448 dump/22k/xvector/dev/spk_xvector.ark:243796
SSB1555 dump/22k/xvector/dev/spk_xvector.ark:245862
SSB1563 dump/22k/xvector/dev/spk_xvector.ark:247928
SSB1567 dump/22k/xvector/dev/spk_xvector.ark:249994
SSB1575 dump/22k/xvector/dev/spk_xvector.ark:252060
SSB1585 dump/22k/xvector/dev/spk_xvector.ark:254126
SSB1593 dump/22k/xvector/dev/spk_xvector.ark:256192
SSB1607 dump/22k/xvector/dev/spk_xvector.ark:258258
SSB1624 dump/22k/xvector/dev/spk_xvector.ark:260324
SSB1625 dump/22k/xvector/dev/spk_xvector.ark:262390
SSB1650 dump/22k/xvector/dev/spk_xvector.ark:264456
SSB1670 dump/22k/xvector/dev/spk_xvector.ark:266522
SSB1684 dump/22k/xvector/dev/spk_xvector.ark:268588
SSB1686 dump/22k/xvector/dev/spk_xvector.ark:270654
SSB1699 dump/22k/xvector/dev/spk_xvector.ark:272720
SSB1711 dump/22k/xvector/dev/spk_xvector.ark:274786
SSB1828 dump/22k/xvector/dev/spk_xvector.ark:276852
SSB1832 dump/22k/xvector/dev/spk_xvector.ark:278918
SSB1837 dump/22k/xvector/dev/spk_xvector.ark:280984
SSB1863 dump/22k/xvector/dev/spk_xvector.ark:283050
SSB1935 dump/22k/xvector/dev/spk_xvector.ark:285116
SSB1939 dump/22k/xvector/dev/spk_xvector.ark:287182
SSB1956 dump/22k/xvector/dev/spk_xvector.ark:289248

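Each line of the `.scp` listing above maps a speaker ID to a byte offset inside the binary `.ark` file. As a minimal sketch, the offsets can be parsed and checked against the size of one stored vector. The record layout in `record_size` is an assumption about the standard Kaldi binary float-vector format, and both function names are illustrative. In practice a library such as `kaldiio` reads these files directly.

```python
SPK_EMBED_DIM = 512  # spk_embed_dim from the TTS config


def parse_scp_line(line):
    """Split 'KEY path.ark:OFFSET' into (key, path, byte offset)."""
    key, location = line.split(maxsplit=1)
    path, offset = location.rsplit(":", 1)
    return key, path, int(offset)


def record_size(key, dim=SPK_EMBED_DIM):
    """Bytes used by one binary float32 vector record in a Kaldi ark.

    Assumed layout: the key plus a space, a two-byte binary marker,
    the three-byte type token "FV ", one size byte plus a 4-byte int32
    dimension, and then `dim` float32 values.
    """
    return len(key) + 1 + 2 + 3 + (1 + 4) + 4 * dim


lines = [
    "SSB0009 dump/22k/xvector/dev/spk_xvector.ark:8",
    "SSB0011 dump/22k/xvector/dev/spk_xvector.ark:2074",
    "SSB0012 dump/22k/xvector/dev/spk_xvector.ark:4140",
]
entries = [parse_scp_line(line) for line in lines]
offsets = [offset for _, _, offset in entries]
strides = [b - a for a, b in zip(offsets, offsets[1:])]

print(strides, record_size("SSB0009"))
print(141 * record_size("SSB0009"))  # estimated size of the dev ark in bytes
```

Under that layout every record takes 2066 bytes, which matches the spacing between consecutive offsets. The first offset of 8 is the seven-character key plus one space. With 141 speakers the estimate is 291,306 bytes, consistent with the 291 kB reported for the dev `.ark` file.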
dump/22k/xvector/test/spk_xvector.ark
ADDED
Binary file (442 kB)

dump/22k/xvector/test/spk_xvector.scp
ADDED
@@ -0,0 +1,214 @@
SSB0005 dump/22k/xvector/test/spk_xvector.ark:8
SSB0009 dump/22k/xvector/test/spk_xvector.ark:2074
SSB0011 dump/22k/xvector/test/spk_xvector.ark:4140
SSB0012 dump/22k/xvector/test/spk_xvector.ark:6206
SSB0016 dump/22k/xvector/test/spk_xvector.ark:8272
SSB0018 dump/22k/xvector/test/spk_xvector.ark:10338
SSB0033 dump/22k/xvector/test/spk_xvector.ark:12404
SSB0038 dump/22k/xvector/test/spk_xvector.ark:14470
SSB0043 dump/22k/xvector/test/spk_xvector.ark:16536
SSB0057 dump/22k/xvector/test/spk_xvector.ark:18602
SSB0073 dump/22k/xvector/test/spk_xvector.ark:20668
SSB0080 dump/22k/xvector/test/spk_xvector.ark:22734
SSB0112 dump/22k/xvector/test/spk_xvector.ark:24800
SSB0122 dump/22k/xvector/test/spk_xvector.ark:26866
SSB0133 dump/22k/xvector/test/spk_xvector.ark:28932
SSB0139 dump/22k/xvector/test/spk_xvector.ark:30998
SSB0145 dump/22k/xvector/test/spk_xvector.ark:33064
SSB0149 dump/22k/xvector/test/spk_xvector.ark:35130
SSB0193 dump/22k/xvector/test/spk_xvector.ark:37196
SSB0197 dump/22k/xvector/test/spk_xvector.ark:39262
SSB0200 dump/22k/xvector/test/spk_xvector.ark:41328
SSB0241 dump/22k/xvector/test/spk_xvector.ark:43394
SSB0246 dump/22k/xvector/test/spk_xvector.ark:45460
SSB0261 dump/22k/xvector/test/spk_xvector.ark:47526
SSB0267 dump/22k/xvector/test/spk_xvector.ark:49592
SSB0273 dump/22k/xvector/test/spk_xvector.ark:51658
SSB0287 dump/22k/xvector/test/spk_xvector.ark:53724
SSB0288 dump/22k/xvector/test/spk_xvector.ark:55790
SSB0299 dump/22k/xvector/test/spk_xvector.ark:57856
SSB0307 dump/22k/xvector/test/spk_xvector.ark:59922
SSB0309 dump/22k/xvector/test/spk_xvector.ark:61988
SSB0315 dump/22k/xvector/test/spk_xvector.ark:64054
SSB0316 dump/22k/xvector/test/spk_xvector.ark:66120
SSB0323 dump/22k/xvector/test/spk_xvector.ark:68186
SSB0338 dump/22k/xvector/test/spk_xvector.ark:70252
SSB0339 dump/22k/xvector/test/spk_xvector.ark:72318
SSB0341 dump/22k/xvector/test/spk_xvector.ark:74384
SSB0342 dump/22k/xvector/test/spk_xvector.ark:76450
SSB0354 dump/22k/xvector/test/spk_xvector.ark:78516
SSB0366 dump/22k/xvector/test/spk_xvector.ark:80582
SSB0375 dump/22k/xvector/test/spk_xvector.ark:82648
SSB0379 dump/22k/xvector/test/spk_xvector.ark:84714
SSB0380 dump/22k/xvector/test/spk_xvector.ark:86780
SSB0382 dump/22k/xvector/test/spk_xvector.ark:88846
SSB0385 dump/22k/xvector/test/spk_xvector.ark:90912
SSB0393 dump/22k/xvector/test/spk_xvector.ark:92978
SSB0394 dump/22k/xvector/test/spk_xvector.ark:95044
SSB0395 dump/22k/xvector/test/spk_xvector.ark:97110
SSB0407 dump/22k/xvector/test/spk_xvector.ark:99176
SSB0415 dump/22k/xvector/test/spk_xvector.ark:101242
SSB0427 dump/22k/xvector/test/spk_xvector.ark:103308
SSB0434 dump/22k/xvector/test/spk_xvector.ark:105374
SSB0435 dump/22k/xvector/test/spk_xvector.ark:107440
SSB0470 dump/22k/xvector/test/spk_xvector.ark:109506
SSB0482 dump/22k/xvector/test/spk_xvector.ark:111572
SSB0502 dump/22k/xvector/test/spk_xvector.ark:113638
SSB0534 dump/22k/xvector/test/spk_xvector.ark:115704
SSB0535 dump/22k/xvector/test/spk_xvector.ark:117770
SSB0539 dump/22k/xvector/test/spk_xvector.ark:119836
|
60 |
+
SSB0544 dump/22k/xvector/test/spk_xvector.ark:121902
|
61 |
+
SSB0565 dump/22k/xvector/test/spk_xvector.ark:123968
|
62 |
+
SSB0570 dump/22k/xvector/test/spk_xvector.ark:126034
|
63 |
+
SSB0578 dump/22k/xvector/test/spk_xvector.ark:128100
|
64 |
+
SSB0588 dump/22k/xvector/test/spk_xvector.ark:130166
|
65 |
+
SSB0590 dump/22k/xvector/test/spk_xvector.ark:132232
|
66 |
+
SSB0594 dump/22k/xvector/test/spk_xvector.ark:134298
|
67 |
+
SSB0599 dump/22k/xvector/test/spk_xvector.ark:136364
|
68 |
+
SSB0601 dump/22k/xvector/test/spk_xvector.ark:138430
|
69 |
+
SSB0603 dump/22k/xvector/test/spk_xvector.ark:140496
|
70 |
+
SSB0606 dump/22k/xvector/test/spk_xvector.ark:142562
|
71 |
+
SSB0607 dump/22k/xvector/test/spk_xvector.ark:144628
|
72 |
+
SSB0609 dump/22k/xvector/test/spk_xvector.ark:146694
|
73 |
+
SSB0614 dump/22k/xvector/test/spk_xvector.ark:148760
|
74 |
+
SSB0623 dump/22k/xvector/test/spk_xvector.ark:150826
|
75 |
+
SSB0629 dump/22k/xvector/test/spk_xvector.ark:152892
|
76 |
+
SSB0631 dump/22k/xvector/test/spk_xvector.ark:154958
|
77 |
+
SSB0632 dump/22k/xvector/test/spk_xvector.ark:157024
|
78 |
+
SSB0666 dump/22k/xvector/test/spk_xvector.ark:159090
|
79 |
+
SSB0668 dump/22k/xvector/test/spk_xvector.ark:161156
|
80 |
+
SSB0671 dump/22k/xvector/test/spk_xvector.ark:163222
|
81 |
+
SSB0686 dump/22k/xvector/test/spk_xvector.ark:165288
|
82 |
+
SSB0693 dump/22k/xvector/test/spk_xvector.ark:167354
|
83 |
+
SSB0700 dump/22k/xvector/test/spk_xvector.ark:169420
|
84 |
+
SSB0702 dump/22k/xvector/test/spk_xvector.ark:171486
|
85 |
+
SSB0710 dump/22k/xvector/test/spk_xvector.ark:173552
|
86 |
+
SSB0711 dump/22k/xvector/test/spk_xvector.ark:175618
|
87 |
+
SSB0716 dump/22k/xvector/test/spk_xvector.ark:177684
|
88 |
+
SSB0717 dump/22k/xvector/test/spk_xvector.ark:179750
|
89 |
+
SSB0720 dump/22k/xvector/test/spk_xvector.ark:181816
|
90 |
+
SSB0723 dump/22k/xvector/test/spk_xvector.ark:183882
|
91 |
+
SSB0736 dump/22k/xvector/test/spk_xvector.ark:185948
|
92 |
+
SSB0737 dump/22k/xvector/test/spk_xvector.ark:188014
|
93 |
+
SSB0746 dump/22k/xvector/test/spk_xvector.ark:190080
|
94 |
+
SSB0749 dump/22k/xvector/test/spk_xvector.ark:192146
|
95 |
+
SSB0751 dump/22k/xvector/test/spk_xvector.ark:194212
|
96 |
+
SSB0758 dump/22k/xvector/test/spk_xvector.ark:196278
|
97 |
+
SSB0760 dump/22k/xvector/test/spk_xvector.ark:198344
|
98 |
+
SSB0762 dump/22k/xvector/test/spk_xvector.ark:200410
|
99 |
+
SSB0778 dump/22k/xvector/test/spk_xvector.ark:202476
|
100 |
+
SSB0780 dump/22k/xvector/test/spk_xvector.ark:204542
|
101 |
+
SSB0784 dump/22k/xvector/test/spk_xvector.ark:206608
|
102 |
+
SSB0786 dump/22k/xvector/test/spk_xvector.ark:208674
|
103 |
+
SSB0794 dump/22k/xvector/test/spk_xvector.ark:210740
|
104 |
+
SSB0809 dump/22k/xvector/test/spk_xvector.ark:212806
|
105 |
+
SSB0817 dump/22k/xvector/test/spk_xvector.ark:214872
|
106 |
+
SSB0822 dump/22k/xvector/test/spk_xvector.ark:216938
|
107 |
+
SSB0851 dump/22k/xvector/test/spk_xvector.ark:219004
|
108 |
+
SSB0863 dump/22k/xvector/test/spk_xvector.ark:221070
|
109 |
+
SSB0871 dump/22k/xvector/test/spk_xvector.ark:223136
|
110 |
+
SSB0887 dump/22k/xvector/test/spk_xvector.ark:225202
|
111 |
+
SSB0913 dump/22k/xvector/test/spk_xvector.ark:227268
|
112 |
+
SSB0915 dump/22k/xvector/test/spk_xvector.ark:229334
|
113 |
+
SSB0919 dump/22k/xvector/test/spk_xvector.ark:231400
|
114 |
+
SSB0935 dump/22k/xvector/test/spk_xvector.ark:233466
|
115 |
+
SSB0966 dump/22k/xvector/test/spk_xvector.ark:235532
|
116 |
+
SSB0987 dump/22k/xvector/test/spk_xvector.ark:237598
|
117 |
+
SSB0993 dump/22k/xvector/test/spk_xvector.ark:239664
|
118 |
+
SSB0997 dump/22k/xvector/test/spk_xvector.ark:241730
|
119 |
+
SSB1000 dump/22k/xvector/test/spk_xvector.ark:243796
|
120 |
+
SSB1001 dump/22k/xvector/test/spk_xvector.ark:245862
|
121 |
+
SSB1002 dump/22k/xvector/test/spk_xvector.ark:247928
|
122 |
+
SSB1008 dump/22k/xvector/test/spk_xvector.ark:249994
|
123 |
+
SSB1020 dump/22k/xvector/test/spk_xvector.ark:252060
|
124 |
+
SSB1024 dump/22k/xvector/test/spk_xvector.ark:254126
|
125 |
+
SSB1050 dump/22k/xvector/test/spk_xvector.ark:256192
|
126 |
+
SSB1055 dump/22k/xvector/test/spk_xvector.ark:258258
|
127 |
+
SSB1056 dump/22k/xvector/test/spk_xvector.ark:260324
|
128 |
+
SSB1064 dump/22k/xvector/test/spk_xvector.ark:262390
|
129 |
+
SSB1072 dump/22k/xvector/test/spk_xvector.ark:264456
|
130 |
+
SSB1091 dump/22k/xvector/test/spk_xvector.ark:266522
|
131 |
+
SSB1100 dump/22k/xvector/test/spk_xvector.ark:268588
|
132 |
+
SSB1108 dump/22k/xvector/test/spk_xvector.ark:270654
|
133 |
+
SSB1110 dump/22k/xvector/test/spk_xvector.ark:272720
|
134 |
+
SSB1115 dump/22k/xvector/test/spk_xvector.ark:274786
|
135 |
+
SSB1125 dump/22k/xvector/test/spk_xvector.ark:276852
|
136 |
+
SSB1126 dump/22k/xvector/test/spk_xvector.ark:278918
|
137 |
+
SSB1131 dump/22k/xvector/test/spk_xvector.ark:280984
|
138 |
+
SSB1135 dump/22k/xvector/test/spk_xvector.ark:283050
|
139 |
+
SSB1136 dump/22k/xvector/test/spk_xvector.ark:285116
|
140 |
+
SSB1138 dump/22k/xvector/test/spk_xvector.ark:287182
|
141 |
+
SSB1161 dump/22k/xvector/test/spk_xvector.ark:289248
|
142 |
+
SSB1176 dump/22k/xvector/test/spk_xvector.ark:291314
|
143 |
+
SSB1187 dump/22k/xvector/test/spk_xvector.ark:293380
|
144 |
+
SSB1197 dump/22k/xvector/test/spk_xvector.ark:295446
|
145 |
+
SSB1203 dump/22k/xvector/test/spk_xvector.ark:297512
|
146 |
+
SSB1204 dump/22k/xvector/test/spk_xvector.ark:299578
|
147 |
+
SSB1215 dump/22k/xvector/test/spk_xvector.ark:301644
|
148 |
+
SSB1216 dump/22k/xvector/test/spk_xvector.ark:303710
|
149 |
+
SSB1218 dump/22k/xvector/test/spk_xvector.ark:305776
|
150 |
+
SSB1219 dump/22k/xvector/test/spk_xvector.ark:307842
|
151 |
+
SSB1221 dump/22k/xvector/test/spk_xvector.ark:309908
|
152 |
+
SSB1239 dump/22k/xvector/test/spk_xvector.ark:311974
|
153 |
+
SSB1253 dump/22k/xvector/test/spk_xvector.ark:314040
|
154 |
+
SSB1274 dump/22k/xvector/test/spk_xvector.ark:316106
|
155 |
+
SSB1302 dump/22k/xvector/test/spk_xvector.ark:318172
|
156 |
+
SSB1320 dump/22k/xvector/test/spk_xvector.ark:320238
|
157 |
+
SSB1322 dump/22k/xvector/test/spk_xvector.ark:322304
|
158 |
+
SSB1328 dump/22k/xvector/test/spk_xvector.ark:324370
|
159 |
+
SSB1340 dump/22k/xvector/test/spk_xvector.ark:326436
|
160 |
+
SSB1341 dump/22k/xvector/test/spk_xvector.ark:328502
|
161 |
+
SSB1365 dump/22k/xvector/test/spk_xvector.ark:330568
|
162 |
+
SSB1366 dump/22k/xvector/test/spk_xvector.ark:332634
|
163 |
+
SSB1377 dump/22k/xvector/test/spk_xvector.ark:334700
|
164 |
+
SSB1382 dump/22k/xvector/test/spk_xvector.ark:336766
|
165 |
+
SSB1383 dump/22k/xvector/test/spk_xvector.ark:338832
|
166 |
+
SSB1385 dump/22k/xvector/test/spk_xvector.ark:340898
|
167 |
+
SSB1392 dump/22k/xvector/test/spk_xvector.ark:342964
|
168 |
+
SSB1393 dump/22k/xvector/test/spk_xvector.ark:345030
|
169 |
+
SSB1399 dump/22k/xvector/test/spk_xvector.ark:347096
|
170 |
+
SSB1402 dump/22k/xvector/test/spk_xvector.ark:349162
|
171 |
+
SSB1408 dump/22k/xvector/test/spk_xvector.ark:351228
|
172 |
+
SSB1431 dump/22k/xvector/test/spk_xvector.ark:353294
|
173 |
+
SSB1437 dump/22k/xvector/test/spk_xvector.ark:355360
|
174 |
+
SSB1448 dump/22k/xvector/test/spk_xvector.ark:357426
|
175 |
+
SSB1452 dump/22k/xvector/test/spk_xvector.ark:359492
|
176 |
+
SSB1457 dump/22k/xvector/test/spk_xvector.ark:361558
|
177 |
+
SSB1555 dump/22k/xvector/test/spk_xvector.ark:363624
|
178 |
+
SSB1563 dump/22k/xvector/test/spk_xvector.ark:365690
|
179 |
+
SSB1575 dump/22k/xvector/test/spk_xvector.ark:367756
|
180 |
+
SSB1585 dump/22k/xvector/test/spk_xvector.ark:369822
|
181 |
+
SSB1593 dump/22k/xvector/test/spk_xvector.ark:371888
|
182 |
+
SSB1607 dump/22k/xvector/test/spk_xvector.ark:373954
|
183 |
+
SSB1624 dump/22k/xvector/test/spk_xvector.ark:376020
|
184 |
+
SSB1625 dump/22k/xvector/test/spk_xvector.ark:378086
|
185 |
+
SSB1630 dump/22k/xvector/test/spk_xvector.ark:380152
|
186 |
+
SSB1650 dump/22k/xvector/test/spk_xvector.ark:382218
|
187 |
+
SSB1670 dump/22k/xvector/test/spk_xvector.ark:384284
|
188 |
+
SSB1684 dump/22k/xvector/test/spk_xvector.ark:386350
|
189 |
+
SSB1686 dump/22k/xvector/test/spk_xvector.ark:388416
|
190 |
+
SSB1699 dump/22k/xvector/test/spk_xvector.ark:390482
|
191 |
+
SSB1711 dump/22k/xvector/test/spk_xvector.ark:392548
|
192 |
+
SSB1728 dump/22k/xvector/test/spk_xvector.ark:394614
|
193 |
+
SSB1739 dump/22k/xvector/test/spk_xvector.ark:396680
|
194 |
+
SSB1745 dump/22k/xvector/test/spk_xvector.ark:398746
|
195 |
+
SSB1759 dump/22k/xvector/test/spk_xvector.ark:400812
|
196 |
+
SSB1781 dump/22k/xvector/test/spk_xvector.ark:402878
|
197 |
+
SSB1782 dump/22k/xvector/test/spk_xvector.ark:404944
|
198 |
+
SSB1806 dump/22k/xvector/test/spk_xvector.ark:407010
|
199 |
+
SSB1809 dump/22k/xvector/test/spk_xvector.ark:409076
|
200 |
+
SSB1810 dump/22k/xvector/test/spk_xvector.ark:411142
|
201 |
+
SSB1828 dump/22k/xvector/test/spk_xvector.ark:413208
|
202 |
+
SSB1831 dump/22k/xvector/test/spk_xvector.ark:415274
|
203 |
+
SSB1832 dump/22k/xvector/test/spk_xvector.ark:417340
|
204 |
+
SSB1837 dump/22k/xvector/test/spk_xvector.ark:419406
|
205 |
+
SSB1846 dump/22k/xvector/test/spk_xvector.ark:421472
|
206 |
+
SSB1863 dump/22k/xvector/test/spk_xvector.ark:423538
|
207 |
+
SSB1872 dump/22k/xvector/test/spk_xvector.ark:425604
|
208 |
+
SSB1878 dump/22k/xvector/test/spk_xvector.ark:427670
|
209 |
+
SSB1891 dump/22k/xvector/test/spk_xvector.ark:429736
|
210 |
+
SSB1902 dump/22k/xvector/test/spk_xvector.ark:431802
|
211 |
+
SSB1918 dump/22k/xvector/test/spk_xvector.ark:433868
|
212 |
+
SSB1935 dump/22k/xvector/test/spk_xvector.ark:435934
|
213 |
+
SSB1939 dump/22k/xvector/test/spk_xvector.ark:438000
|
214 |
+
SSB1956 dump/22k/xvector/test/spk_xvector.ark:440066
|
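Each added line in the `.scp` listing above pairs a speaker ID with an archive path and a byte offset, separated by a colon. A minimal sketch of splitting one such line follows; `ScpEntry` and `parse_scp_line` are hypothetical helper names for illustration, not part of ESPnet or Kaldi.

```python
from typing import NamedTuple


class ScpEntry(NamedTuple):
    """One line of an scp index: key, archive path, byte offset."""
    key: str
    ark_path: str
    offset: int


def parse_scp_line(line: str) -> ScpEntry:
    # The key is the first whitespace-separated field; the rest is "path:offset".
    key, location = line.strip().split(maxsplit=1)
    # Split on the LAST colon so that paths containing colons still work.
    ark_path, _, offset = location.rpartition(":")
    return ScpEntry(key, ark_path, int(offset))


entry = parse_scp_line("SSB0005 dump/22k/xvector/test/spk_xvector.ark:8")
print(entry.key, entry.ark_path, entry.offset)
```

In practice a Kaldi-compatible reader would consume these files directly; the sketch only illustrates the line layout.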
dump/22k/xvector/train_no_dev/spk_xvector.ark
ADDED
Binary file (359 kB).
dump/22k/xvector/train_no_dev/spk_xvector.scp
ADDED
@@ -0,0 +1,174 @@
1 |
+
SSB0005 dump/22k/xvector/train_no_dev/spk_xvector.ark:8
|
2 |
+
SSB0009 dump/22k/xvector/train_no_dev/spk_xvector.ark:2074
|
3 |
+
SSB0011 dump/22k/xvector/train_no_dev/spk_xvector.ark:4140
|
4 |
+
SSB0012 dump/22k/xvector/train_no_dev/spk_xvector.ark:6206
|
5 |
+
SSB0016 dump/22k/xvector/train_no_dev/spk_xvector.ark:8272
|
6 |
+
SSB0018 dump/22k/xvector/train_no_dev/spk_xvector.ark:10338
|
7 |
+
SSB0033 dump/22k/xvector/train_no_dev/spk_xvector.ark:12404
|
8 |
+
SSB0038 dump/22k/xvector/train_no_dev/spk_xvector.ark:14470
|
9 |
+
SSB0043 dump/22k/xvector/train_no_dev/spk_xvector.ark:16536
|
10 |
+
SSB0057 dump/22k/xvector/train_no_dev/spk_xvector.ark:18602
|
11 |
+
SSB0073 dump/22k/xvector/train_no_dev/spk_xvector.ark:20668
|
12 |
+
SSB0080 dump/22k/xvector/train_no_dev/spk_xvector.ark:22734
|
13 |
+
SSB0112 dump/22k/xvector/train_no_dev/spk_xvector.ark:24800
|
14 |
+
SSB0122 dump/22k/xvector/train_no_dev/spk_xvector.ark:26866
|
15 |
+
SSB0133 dump/22k/xvector/train_no_dev/spk_xvector.ark:28932
|
16 |
+
SSB0139 dump/22k/xvector/train_no_dev/spk_xvector.ark:30998
|
17 |
+
SSB0145 dump/22k/xvector/train_no_dev/spk_xvector.ark:33064
|
18 |
+
SSB0149 dump/22k/xvector/train_no_dev/spk_xvector.ark:35130
|
19 |
+
SSB0193 dump/22k/xvector/train_no_dev/spk_xvector.ark:37196
|
20 |
+
SSB0197 dump/22k/xvector/train_no_dev/spk_xvector.ark:39262
|
21 |
+
SSB0200 dump/22k/xvector/train_no_dev/spk_xvector.ark:41328
|
22 |
+
SSB0241 dump/22k/xvector/train_no_dev/spk_xvector.ark:43394
|
23 |
+
SSB0246 dump/22k/xvector/train_no_dev/spk_xvector.ark:45460
|
24 |
+
SSB0261 dump/22k/xvector/train_no_dev/spk_xvector.ark:47526
|
25 |
+
SSB0267 dump/22k/xvector/train_no_dev/spk_xvector.ark:49592
|
26 |
+
SSB0273 dump/22k/xvector/train_no_dev/spk_xvector.ark:51658
|
27 |
+
SSB0287 dump/22k/xvector/train_no_dev/spk_xvector.ark:53724
|
28 |
+
SSB0288 dump/22k/xvector/train_no_dev/spk_xvector.ark:55790
|
29 |
+
SSB0299 dump/22k/xvector/train_no_dev/spk_xvector.ark:57856
|
30 |
+
SSB0307 dump/22k/xvector/train_no_dev/spk_xvector.ark:59922
|
31 |
+
SSB0309 dump/22k/xvector/train_no_dev/spk_xvector.ark:61988
|
32 |
+
SSB0315 dump/22k/xvector/train_no_dev/spk_xvector.ark:64054
|
33 |
+
SSB0316 dump/22k/xvector/train_no_dev/spk_xvector.ark:66120
|
34 |
+
SSB0323 dump/22k/xvector/train_no_dev/spk_xvector.ark:68186
|
35 |
+
SSB0338 dump/22k/xvector/train_no_dev/spk_xvector.ark:70252
|
36 |
+
SSB0339 dump/22k/xvector/train_no_dev/spk_xvector.ark:72318
|
37 |
+
SSB0341 dump/22k/xvector/train_no_dev/spk_xvector.ark:74384
|
38 |
+
SSB0342 dump/22k/xvector/train_no_dev/spk_xvector.ark:76450
|
39 |
+
SSB0354 dump/22k/xvector/train_no_dev/spk_xvector.ark:78516
|
40 |
+
SSB0366 dump/22k/xvector/train_no_dev/spk_xvector.ark:80582
|
41 |
+
SSB0375 dump/22k/xvector/train_no_dev/spk_xvector.ark:82648
|
42 |
+
SSB0379 dump/22k/xvector/train_no_dev/spk_xvector.ark:84714
|
43 |
+
SSB0380 dump/22k/xvector/train_no_dev/spk_xvector.ark:86780
|
44 |
+
SSB0382 dump/22k/xvector/train_no_dev/spk_xvector.ark:88846
|
45 |
+
SSB0385 dump/22k/xvector/train_no_dev/spk_xvector.ark:90912
|
46 |
+
SSB0393 dump/22k/xvector/train_no_dev/spk_xvector.ark:92978
|
47 |
+
SSB0394 dump/22k/xvector/train_no_dev/spk_xvector.ark:95044
|
48 |
+
SSB0395 dump/22k/xvector/train_no_dev/spk_xvector.ark:97110
|
49 |
+
SSB0407 dump/22k/xvector/train_no_dev/spk_xvector.ark:99176
|
50 |
+
SSB0415 dump/22k/xvector/train_no_dev/spk_xvector.ark:101242
|
51 |
+
SSB0426 dump/22k/xvector/train_no_dev/spk_xvector.ark:103308
|
52 |
+
SSB0427 dump/22k/xvector/train_no_dev/spk_xvector.ark:105374
|
53 |
+
SSB0434 dump/22k/xvector/train_no_dev/spk_xvector.ark:107440
|
54 |
+
SSB0435 dump/22k/xvector/train_no_dev/spk_xvector.ark:109506
|
55 |
+
SSB0470 dump/22k/xvector/train_no_dev/spk_xvector.ark:111572
|
56 |
+
SSB0482 dump/22k/xvector/train_no_dev/spk_xvector.ark:113638
|
57 |
+
SSB0502 dump/22k/xvector/train_no_dev/spk_xvector.ark:115704
|
58 |
+
SSB0534 dump/22k/xvector/train_no_dev/spk_xvector.ark:117770
|
59 |
+
SSB0535 dump/22k/xvector/train_no_dev/spk_xvector.ark:119836
|
60 |
+
SSB0539 dump/22k/xvector/train_no_dev/spk_xvector.ark:121902
|
61 |
+
SSB0544 dump/22k/xvector/train_no_dev/spk_xvector.ark:123968
|
62 |
+
SSB0565 dump/22k/xvector/train_no_dev/spk_xvector.ark:126034
|
63 |
+
SSB0570 dump/22k/xvector/train_no_dev/spk_xvector.ark:128100
|
64 |
+
SSB0578 dump/22k/xvector/train_no_dev/spk_xvector.ark:130166
|
65 |
+
SSB0588 dump/22k/xvector/train_no_dev/spk_xvector.ark:132232
|
66 |
+
SSB0590 dump/22k/xvector/train_no_dev/spk_xvector.ark:134298
|
67 |
+
SSB0594 dump/22k/xvector/train_no_dev/spk_xvector.ark:136364
|
68 |
+
SSB0599 dump/22k/xvector/train_no_dev/spk_xvector.ark:138430
|
69 |
+
SSB0601 dump/22k/xvector/train_no_dev/spk_xvector.ark:140496
|
70 |
+
SSB0603 dump/22k/xvector/train_no_dev/spk_xvector.ark:142562
|
71 |
+
SSB0606 dump/22k/xvector/train_no_dev/spk_xvector.ark:144628
|
72 |
+
SSB0607 dump/22k/xvector/train_no_dev/spk_xvector.ark:146694
|
73 |
+
SSB0609 dump/22k/xvector/train_no_dev/spk_xvector.ark:148760
|
74 |
+
SSB0614 dump/22k/xvector/train_no_dev/spk_xvector.ark:150826
|
75 |
+
SSB0623 dump/22k/xvector/train_no_dev/spk_xvector.ark:152892
|
76 |
+
SSB0629 dump/22k/xvector/train_no_dev/spk_xvector.ark:154958
|
77 |
+
SSB0631 dump/22k/xvector/train_no_dev/spk_xvector.ark:157024
|
78 |
+
SSB0632 dump/22k/xvector/train_no_dev/spk_xvector.ark:159090
|
79 |
+
SSB0666 dump/22k/xvector/train_no_dev/spk_xvector.ark:161156
|
80 |
+
SSB0668 dump/22k/xvector/train_no_dev/spk_xvector.ark:163222
|
81 |
+
SSB0671 dump/22k/xvector/train_no_dev/spk_xvector.ark:165288
|
82 |
+
SSB0686 dump/22k/xvector/train_no_dev/spk_xvector.ark:167354
|
83 |
+
SSB0700 dump/22k/xvector/train_no_dev/spk_xvector.ark:169420
|
84 |
+
SSB0710 dump/22k/xvector/train_no_dev/spk_xvector.ark:171486
|
85 |
+
SSB0720 dump/22k/xvector/train_no_dev/spk_xvector.ark:173552
|
86 |
+
SSB0723 dump/22k/xvector/train_no_dev/spk_xvector.ark:175618
|
87 |
+
SSB0737 dump/22k/xvector/train_no_dev/spk_xvector.ark:177684
|
88 |
+
SSB0746 dump/22k/xvector/train_no_dev/spk_xvector.ark:179750
|
89 |
+
SSB0748 dump/22k/xvector/train_no_dev/spk_xvector.ark:181816
|
90 |
+
SSB0751 dump/22k/xvector/train_no_dev/spk_xvector.ark:183882
|
91 |
+
SSB0758 dump/22k/xvector/train_no_dev/spk_xvector.ark:185948
|
92 |
+
SSB0760 dump/22k/xvector/train_no_dev/spk_xvector.ark:188014
|
93 |
+
SSB0762 dump/22k/xvector/train_no_dev/spk_xvector.ark:190080
|
94 |
+
SSB0778 dump/22k/xvector/train_no_dev/spk_xvector.ark:192146
|
95 |
+
SSB0780 dump/22k/xvector/train_no_dev/spk_xvector.ark:194212
|
96 |
+
SSB0784 dump/22k/xvector/train_no_dev/spk_xvector.ark:196278
|
97 |
+
SSB0786 dump/22k/xvector/train_no_dev/spk_xvector.ark:198344
|
98 |
+
SSB0794 dump/22k/xvector/train_no_dev/spk_xvector.ark:200410
|
99 |
+
SSB0817 dump/22k/xvector/train_no_dev/spk_xvector.ark:202476
|
100 |
+
SSB0851 dump/22k/xvector/train_no_dev/spk_xvector.ark:204542
|
101 |
+
SSB0863 dump/22k/xvector/train_no_dev/spk_xvector.ark:206608
|
102 |
+
SSB0871 dump/22k/xvector/train_no_dev/spk_xvector.ark:208674
|
103 |
+
SSB0887 dump/22k/xvector/train_no_dev/spk_xvector.ark:210740
|
104 |
+
SSB0913 dump/22k/xvector/train_no_dev/spk_xvector.ark:212806
|
105 |
+
SSB0915 dump/22k/xvector/train_no_dev/spk_xvector.ark:214872
|
106 |
+
SSB0919 dump/22k/xvector/train_no_dev/spk_xvector.ark:216938
|
107 |
+
SSB0935 dump/22k/xvector/train_no_dev/spk_xvector.ark:219004
|
108 |
+
SSB0966 dump/22k/xvector/train_no_dev/spk_xvector.ark:221070
|
109 |
+
SSB0987 dump/22k/xvector/train_no_dev/spk_xvector.ark:223136
|
110 |
+
SSB1008 dump/22k/xvector/train_no_dev/spk_xvector.ark:225202
|
111 |
+
SSB1020 dump/22k/xvector/train_no_dev/spk_xvector.ark:227268
|
112 |
+
SSB1024 dump/22k/xvector/train_no_dev/spk_xvector.ark:229334
|
113 |
+
SSB1050 dump/22k/xvector/train_no_dev/spk_xvector.ark:231400
|
114 |
+
SSB1055 dump/22k/xvector/train_no_dev/spk_xvector.ark:233466
|
115 |
+
SSB1056 dump/22k/xvector/train_no_dev/spk_xvector.ark:235532
|
116 |
+
SSB1064 dump/22k/xvector/train_no_dev/spk_xvector.ark:237598
|
117 |
+
SSB1072 dump/22k/xvector/train_no_dev/spk_xvector.ark:239664
|
118 |
+
SSB1091 dump/22k/xvector/train_no_dev/spk_xvector.ark:241730
|
119 |
+
SSB1096 dump/22k/xvector/train_no_dev/spk_xvector.ark:243796
|
120 |
+
SSB1100 dump/22k/xvector/train_no_dev/spk_xvector.ark:245862
|
121 |
+
SSB1108 dump/22k/xvector/train_no_dev/spk_xvector.ark:247928
|
122 |
+
SSB1115 dump/22k/xvector/train_no_dev/spk_xvector.ark:249994
|
123 |
+
SSB1125 dump/22k/xvector/train_no_dev/spk_xvector.ark:252060
|
124 |
+
SSB1131 dump/22k/xvector/train_no_dev/spk_xvector.ark:254126
|
125 |
+
SSB1136 dump/22k/xvector/train_no_dev/spk_xvector.ark:256192
|
126 |
+
SSB1138 dump/22k/xvector/train_no_dev/spk_xvector.ark:258258
|
127 |
+
SSB1161 dump/22k/xvector/train_no_dev/spk_xvector.ark:260324
|
128 |
+
SSB1203 dump/22k/xvector/train_no_dev/spk_xvector.ark:262390
|
129 |
+
SSB1204 dump/22k/xvector/train_no_dev/spk_xvector.ark:264456
|
130 |
+
SSB1218 dump/22k/xvector/train_no_dev/spk_xvector.ark:266522
|
131 |
+
SSB1221 dump/22k/xvector/train_no_dev/spk_xvector.ark:268588
|
132 |
+
SSB1253 dump/22k/xvector/train_no_dev/spk_xvector.ark:270654
|
133 |
+
SSB1320 dump/22k/xvector/train_no_dev/spk_xvector.ark:272720
|
134 |
+
SSB1341 dump/22k/xvector/train_no_dev/spk_xvector.ark:274786
|
135 |
+
SSB1366 dump/22k/xvector/train_no_dev/spk_xvector.ark:276852
|
136 |
+
SSB1377 dump/22k/xvector/train_no_dev/spk_xvector.ark:278918
|
137 |
+
SSB1383 dump/22k/xvector/train_no_dev/spk_xvector.ark:280984
|
138 |
+
SSB1385 dump/22k/xvector/train_no_dev/spk_xvector.ark:283050
|
139 |
+
SSB1392 dump/22k/xvector/train_no_dev/spk_xvector.ark:285116
|
140 |
+
SSB1393 dump/22k/xvector/train_no_dev/spk_xvector.ark:287182
|
141 |
+
SSB1408 dump/22k/xvector/train_no_dev/spk_xvector.ark:289248
|
142 |
+
SSB1431 dump/22k/xvector/train_no_dev/spk_xvector.ark:291314
|
143 |
+
SSB1437 dump/22k/xvector/train_no_dev/spk_xvector.ark:293380
|
144 |
+
SSB1448 dump/22k/xvector/train_no_dev/spk_xvector.ark:295446
|
145 |
+
SSB1555 dump/22k/xvector/train_no_dev/spk_xvector.ark:297512
|
146 |
+
SSB1563 dump/22k/xvector/train_no_dev/spk_xvector.ark:299578
|
147 |
+
SSB1567 dump/22k/xvector/train_no_dev/spk_xvector.ark:301644
|
148 |
+
SSB1575 dump/22k/xvector/train_no_dev/spk_xvector.ark:303710
|
149 |
+
SSB1585 dump/22k/xvector/train_no_dev/spk_xvector.ark:305776
|
150 |
+
SSB1593 dump/22k/xvector/train_no_dev/spk_xvector.ark:307842
|
151 |
+
SSB1607 dump/22k/xvector/train_no_dev/spk_xvector.ark:309908
|
152 |
+
SSB1624 dump/22k/xvector/train_no_dev/spk_xvector.ark:311974
|
153 |
+
SSB1625 dump/22k/xvector/train_no_dev/spk_xvector.ark:314040
|
154 |
+
SSB1630 dump/22k/xvector/train_no_dev/spk_xvector.ark:316106
|
155 |
+
SSB1650 dump/22k/xvector/train_no_dev/spk_xvector.ark:318172
|
156 |
+
SSB1670 dump/22k/xvector/train_no_dev/spk_xvector.ark:320238
|
157 |
+
SSB1684 dump/22k/xvector/train_no_dev/spk_xvector.ark:322304
|
158 |
+
SSB1686 dump/22k/xvector/train_no_dev/spk_xvector.ark:324370
|
159 |
+
SSB1699 dump/22k/xvector/train_no_dev/spk_xvector.ark:326436
|
160 |
+
SSB1711 dump/22k/xvector/train_no_dev/spk_xvector.ark:328502
|
161 |
+
SSB1759 dump/22k/xvector/train_no_dev/spk_xvector.ark:330568
|
162 |
+
SSB1806 dump/22k/xvector/train_no_dev/spk_xvector.ark:332634
|
163 |
+
SSB1828 dump/22k/xvector/train_no_dev/spk_xvector.ark:334700
|
164 |
+
SSB1831 dump/22k/xvector/train_no_dev/spk_xvector.ark:336766
|
165 |
+
SSB1832 dump/22k/xvector/train_no_dev/spk_xvector.ark:338832
|
166 |
+
SSB1837 dump/22k/xvector/train_no_dev/spk_xvector.ark:340898
|
167 |
+
SSB1846 dump/22k/xvector/train_no_dev/spk_xvector.ark:342964
|
168 |
+
SSB1863 dump/22k/xvector/train_no_dev/spk_xvector.ark:345030
|
169 |
+
SSB1878 dump/22k/xvector/train_no_dev/spk_xvector.ark:347096
|
170 |
+
SSB1891 dump/22k/xvector/train_no_dev/spk_xvector.ark:349162
|
171 |
+
SSB1918 dump/22k/xvector/train_no_dev/spk_xvector.ark:351228
|
172 |
+
SSB1935 dump/22k/xvector/train_no_dev/spk_xvector.ark:353294
|
173 |
+
SSB1939 dump/22k/xvector/train_no_dev/spk_xvector.ark:355360
|
174 |
+
SSB1956 dump/22k/xvector/train_no_dev/spk_xvector.ark:357426
|
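The offsets in the listing above advance by a constant 2066 bytes per speaker. Assuming the archive uses the Kaldi binary float-vector layout (a `\0B` marker, the token `FV `, a one-byte size field, an int32 dimension, then float32 values), that stride would correspond to a 512-dimensional vector per speaker. The sketch below works through the arithmetic; the layout constants are an assumption about the file format, not something the diff itself states.

```python
# Offsets copied from the first entries of the train_no_dev scp listing.
offsets = [8, 2074, 4140, 6206]

# Assumed Kaldi binary float-vector layout (not confirmed by the diff):
KEY_LEN = len("SSB0005") + 1   # 7-character key plus the separating space
HEADER_LEN = 2 + 3 + 1 + 4     # "\0B" marker, "FV " token, size byte, int32 dim
BYTES_PER_FLOAT = 4            # float32

# Every consecutive pair of offsets should be the same distance apart.
strides = {b - a for a, b in zip(offsets, offsets[1:])}
assert len(strides) == 1, "records are not fixed-size"
stride = strides.pop()

payload_bytes = stride - KEY_LEN - HEADER_LEN
dim = payload_bytes // BYTES_PER_FLOAT

# 174 records of this size give the total archive size in bytes.
total_bytes = 174 * stride
print(stride, dim, total_bytes)
```

Under these assumptions the total comes to 359,484 bytes, which is consistent with the "359 kB" reported for the binary archive.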
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/600epoch.pth
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:81780ea1f480ba58739dedf160969a33d8bb483daf58cca0b767a55de61abba7
|
3 |
+
size 386540402
|
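The three added lines for `600epoch.pth` are a Git LFS pointer rather than the checkpoint itself: a spec version, a SHA-256 object ID, and the size in bytes. A minimal sketch of reading those fields follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    # Each pointer line is "<name> <value>"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:81780ea1f480ba58739dedf160969a33d8bb483daf58cca0b767a55de61abba7
size 386540402
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

The `size` field is what a client can compare against the downloaded checkpoint before hashing it.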
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/config.yaml
ADDED
@@ -0,0 +1,500 @@
1 |
+
config: conf/train.yaml
|
2 |
+
print_config: false
|
3 |
+
log_level: INFO
|
4 |
+
dry_run: false
|
5 |
+
iterator_type: sequence
|
6 |
+
output_dir: exp/22k/tts_train_raw_phn_pypinyin_g2p_phone
|
7 |
+
ngpu: 1
|
8 |
+
seed: 777
|
9 |
+
num_workers: 4
|
10 |
+
num_att_plot: 3
|
11 |
+
dist_backend: nccl
|
12 |
+
dist_init_method: env://
|
13 |
+
dist_world_size: null
|
14 |
+
dist_rank: null
|
15 |
+
local_rank: 0
|
16 |
+
dist_master_addr: null
|
17 |
+
dist_master_port: null
|
18 |
+
dist_launcher: null
|
19 |
+
multiprocessing_distributed: false
|
20 |
+
unused_parameters: true
|
21 |
+
sharded_ddp: false
|
22 |
+
cudnn_enabled: true
|
23 |
+
cudnn_benchmark: false
|
24 |
+
cudnn_deterministic: false
|
25 |
+
collect_stats: false
|
26 |
+
write_collected_feats: false
|
27 |
+
max_epoch: 1000
|
28 |
+
patience: null
|
29 |
+
val_scheduler_criterion:
|
30 |
+
- valid
|
31 |
+
- loss
|
32 |
+
early_stopping_criterion:
|
33 |
+
- valid
|
34 |
+
- loss
|
35 |
+
- min
|
36 |
+
best_model_criterion:
|
37 |
+
- - train
|
38 |
+
- total_count
|
39 |
+
- max
|
40 |
+
keep_nbest_models: 10
|
41 |
+
nbest_averaging_interval: 0
|
42 |
+
grad_clip: -1
|
43 |
+
grad_clip_type: 2.0
|
44 |
+
grad_noise: false
|
45 |
+
accum_grad: 1
|
46 |
+
no_forward_run: false
|
47 |
+
resume: true
|
48 |
+
train_dtype: float32
|
49 |
+
use_amp: false
|
50 |
+
log_interval: 50
|
51 |
+
use_matplotlib: true
|
52 |
+
use_tensorboard: true
|
53 |
+
create_graph_in_tensorboard: false
|
54 |
+
use_wandb: false
|
55 |
+
wandb_project: null
|
56 |
+
wandb_id: null
|
57 |
+
wandb_entity: null
|
58 |
+
wandb_name: null
|
59 |
+
wandb_model_log_interval: -1
|
60 |
+
detect_anomaly: false
|
61 |
+
pretrain_path: null
|
62 |
+
init_param: []
|
63 |
+
ignore_init_mismatch: false
|
64 |
+
freeze_param: []
|
65 |
+
num_iters_per_epoch: 1000
|
66 |
+
batch_size: 20
|
67 |
+
valid_batch_size: null
|
68 |
+
batch_bins: 1250000
|
69 |
+
valid_batch_bins: null
|
70 |
+
train_shape_file:
|
71 |
+
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/train/text_shape.phn
|
72 |
+
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/train/speech_shape
|
73 |
+
valid_shape_file:
|
74 |
+
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/valid/text_shape.phn
|
75 |
+
- exp/22k/tts_stats_raw_linear_spectrogram_phn_pypinyin_g2p_phone/valid/speech_shape
|
76 |
+
batch_type: numel
|
77 |
+
valid_batch_type: null
|
78 |
+
fold_length:
|
79 |
+
- 150
|
80 |
+
- 204800
|
81 |
+
sort_in_batch: descending
|
82 |
+
sort_batch: descending
|
83 |
+
multiple_iterator: false
|
84 |
+
chunk_length: 500
|
85 |
+
chunk_shift_ratio: 0.5
|
86 |
+
num_cache_chunks: 1024
|
87 |
+
chunk_excluded_key_prefixes: []
|
88 |
+
train_data_path_and_name_and_type:
|
89 |
+
- - dump/22k/raw/train_no_dev/text
|
90 |
+
- text
|
91 |
+
- text
|
92 |
+
- - dump/22k/raw/train_no_dev/wav.scp
|
93 |
+
- speech
|
94 |
+
- sound
|
95 |
+
- - dump/22k/xvector/train_no_dev/xvector.scp
|
96 |
+
- spembs
|
97 |
+
- kaldi_ark
|
98 |
+
valid_data_path_and_name_and_type:
|
99 |
+
- - dump/22k/raw/dev/text
|
100 |
+
- text
|
101 |
+
- text
|
102 |
+
- - dump/22k/raw/dev/wav.scp
|
103 |
+
- speech
|
104 |
+
- sound
|
105 |
+
- - dump/22k/xvector/dev/xvector.scp
|
106 |
+
- spembs
|
107 |
+
- kaldi_ark
|
108 |
+
allow_variable_data_keys: false
|
109 |
+
max_cache_size: 0.0
|
110 |
+
max_cache_fd: 32
|
111 |
+
valid_max_cache_size: null
|
112 |
+
exclude_weight_decay: false
|
113 |
+
exclude_weight_decay_conf: {}
|
114 |
+
optim: adamw
|
115 |
+
optim_conf:
|
116 |
+
lr: 0.0002
|
117 |
+
betas:
|
118 |
+
- 0.8
|
119 |
+
- 0.99
|
120 |
+
eps: 1.0e-09
|
121 |
+
weight_decay: 0.0
|
122 |
+
scheduler: exponentiallr
|
123 |
+
scheduler_conf:
|
124 |
+
gamma: 0.999875
|
125 |
+
optim2: adamw
|
126 |
+
optim2_conf:
|
127 |
+
lr: 0.0002
|
128 |
+
betas:
|
129 |
+
- 0.8
|
130 |
+
- 0.99
|
131 |
+
eps: 1.0e-09
|
132 |
+
weight_decay: 0.0
|
133 |
+
scheduler2: exponentiallr
|
134 |
+
scheduler2_conf:
|
135 |
+
gamma: 0.999875
|
136 |
+
generator_first: false
|
137 |
+
token_list:
|
138 |
+
- <blank>
|
139 |
+
- <unk>
|
140 |
+
- d
|
141 |
+
- sh
|
142 |
+
- j
|
143 |
+
- i4
|
144 |
+
- zh
|
145 |
+
- l
|
146 |
+
- x
|
147 |
+
- e
|
148 |
+
- b
|
149 |
+
- g
|
150 |
+
- i1
|
151 |
+
- h
|
152 |
+
- q
|
153 |
+
- m
|
154 |
+
- t
|
155 |
+
- i2
|
156 |
+
- u4
|
157 |
+
- z
|
158 |
+
- ch
|
159 |
+
- i3
|
160 |
+
- f
|
161 |
+
- s
|
162 |
+
- n
|
163 |
+
- iou3
|
164 |
+
- r
|
165 |
+
- ian4
|
166 |
+
- ong1
|
167 |
+
- uei4
|
168 |
+
- e4
|
169 |
+
- en2
|
170 |
+
- ai4
|
171 |
+
- k
|
172 |
+
- ing2
|
173 |
+
- a1
|
174 |
+
- uo3
|
175 |
+
- u3
|
176 |
+
- ao4
|
177 |
+
- p
|
178 |
+
- an1
|
179 |
+
- eng2
|
180 |
+
- e2
|
181 |
+
- in1
|
182 |
+
- c
|
183 |
+
- ai2
|
184 |
+
- an4
|
185 |
+
- ian2
|
186 |
+
- u2
|
187 |
+
- ang4
|
188 |
+
- ian1
|
189 |
+
- ai3
|
190 |
+
- ing1
|
191 |
+
- ao3
|
192 |
+
- uo4
|
193 |
+
- ian3
|
194 |
+
- ing4
|
195 |
+
- ü4
|
196 |
+
- ang1
|
197 |
+
- u1
|
198 |
+
- iao4
|
199 |
+
- eng1
|
200 |
+
- iou4
|
201 |
+
- a4
|
202 |
+
- üan2
|
203 |
+
- ie4
|
204 |
+
- ou4
|
205 |
+
- er4
|
206 |
+
- en1
|
207 |
+
- ong2
|
208 |
+
- e1
|
209 |
+
- an3
|
210 |
+
- ei4
|
211 |
+
- uo2
|
212 |
+
- ou3
|
213 |
+
- ang2
|
214 |
+
- iang4
|
215 |
+
- ou1
|
216 |
+
- ang3
|
217 |
+
- an2
|
218 |
+
- eng4
|
219 |
+
- ong4
|
220 |
+
- uan4
|
221 |
+
- a3
|
222 |
+
- ia4
|
223 |
+
- ia1
|
224 |
+
- iao1
|
225 |
+
- iang1
|
226 |
+
- iou2
|
227 |
+
- uo1
|
228 |
+
- ei3
|
229 |
+
- iao3
|
230 |
+
- in4
|
231 |
+
- e3
|
232 |
+
- ü3
|
233 |
+
- iang3
|
234 |
+
- uei2
|
235 |
+
- en3
|
236 |
+
- uan1
|
237 |
+
- ie3
|
238 |
+
- ao1
|
239 |
+
- ai1
|
240 |
+
- üe4
|
241 |
+
- ü2
|
242 |
+
- ing3
|
243 |
+
- en4
|
244 |
+
- uei1
|
245 |
+
- er2
|
246 |
+
- uan3
|
247 |
+
- ü1
|
248 |
+
- in3
|
249 |
+
- en
|
250 |
+
- üe2
|
251 |
+
- ie2
|
252 |
+
- ei2
|
253 |
+
- ua4
|
254 |
+
- uan2
|
255 |
+
- in2
|
256 |
+
- a2
|
257 |
+
- ie1
|
258 |
+
- iang2
|
259 |
+
- ou2
|
260 |
+
- ong3
|
261 |
+
- uang3
|
262 |
+
- eng3
|
263 |
+
- uen1
|
264 |
+
- uai4
|
265 |
+
- ün4
|
266 |
+
- uang4
|
267 |
+
- uei3
|
268 |
+
- uen2
|
269 |
+
- uen4
|
270 |
+
- i
|
271 |
+
- iong4
|
272 |
+
- v3
|
273 |
+
- iao2
|
274 |
+
- üan4
|
275 |
+
- uang1
|
276 |
+
- ei1
|
277 |
+
- o2
|
278 |
+
- iou1
|
279 |
+
- uang2
|
280 |
+
- a
|
281 |
+
- ao2
|
282 |
+
- o1
|
283 |
+
- ua2
|
284 |
+
- uen3
|
285 |
+
- ua1
|
286 |
+
- v4
|
287 |
+
- üan3
|
288 |
+
- ün1
|
289 |
+
- üe1
|
290 |
+
- ün2
|
291 |
+
- o4
|
292 |
+
- er3
|
293 |
+
- iong3
|
294 |
+
- üan1
|
295 |
+
- ia3
|
296 |
+
- ia2
|
297 |
+
- iong1
|
298 |
+
- üe3
|
299 |
+
- ve4
|
300 |
+
- iong2
|
301 |
+
- uai2
|
302 |
+
- er
|
303 |
+
- ua3
|
304 |
+
- uai1
|
305 |
+
- ou
|
306 |
+
- ün3
|
307 |
+
- uai3
|
308 |
+
- ia
|
309 |
+
- uo
|
310 |
+
- o3
|
311 |
+
- v2
|
312 |
+
- ueng1
|
313 |
+
- o
|
314 |
+
- ei
|
315 |
+
- ua
|
316 |
+
- io1
|
317 |
+
- <sos/eos>
|
318 |
+
odim: null
|
319 |
+
model_conf: {}
|
320 |
+
use_preprocessor: true
|
321 |
+
token_type: phn
|
322 |
+
bpemodel: null
|
323 |
+
non_linguistic_symbols: null
|
324 |
+
cleaner: null
|
325 |
+
g2p: pypinyin_g2p_phone
|
326 |
+
feats_extract: linear_spectrogram
|
327 |
+
feats_extract_conf:
|
328 |
+
n_fft: 1024
|
329 |
+
hop_length: 256
|
330 |
+
win_length: null
|
331 |
+
normalize: null
|
332 |
+
normalize_conf: {}
|
333 |
+
tts: vits
|
334 |
+
tts_conf:
|
335 |
+
generator_type: vits_generator
|
336 |
+
generator_params:
|
337 |
+
hidden_channels: 192
|
338 |
+
spks: -1
|
339 |
+
spk_embed_dim: 512
|
340 |
+
global_channels: 256
|
341 |
+
segment_size: 32
|
342 |
+
text_encoder_attention_heads: 2
|
343 |
+
text_encoder_ffn_expand: 4
|
344 |
+
text_encoder_blocks: 6
|
345 |
+
text_encoder_positionwise_layer_type: conv1d
|
346 |
+
text_encoder_positionwise_conv_kernel_size: 3
|
347 |
+
text_encoder_positional_encoding_layer_type: rel_pos
|
348 |
+
text_encoder_self_attention_layer_type: rel_selfattn
|
349 |
+
text_encoder_activation_type: swish
|
350 |
+
text_encoder_normalize_before: true
|
351 |
+
text_encoder_dropout_rate: 0.1
|
352 |
+
text_encoder_positional_dropout_rate: 0.0
|
353 |
+
text_encoder_attention_dropout_rate: 0.1
|
354 |
+
use_macaron_style_in_text_encoder: true
|
355 |
+
use_conformer_conv_in_text_encoder: false
|
356 |
+
text_encoder_conformer_kernel_size: -1
|
357 |
+
decoder_kernel_size: 7
|
358 |
+
decoder_channels: 512
|
359 |
+
decoder_upsample_scales:
|
360 |
+
- 8
|
361 |
+
- 8
|
362 |
+
- 2
|
363 |
+
- 2
|
364 |
+
decoder_upsample_kernel_sizes:
|
365 |
+
- 16
|
366 |
+
- 16
|
367 |
+
- 4
|
368 |
+
- 4
|
369 |
+
decoder_resblock_kernel_sizes:
|
370 |
+
- 3
|
371 |
+
- 7
|
372 |
+
- 11
|
373 |
+
decoder_resblock_dilations:
|
374 |
+
- - 1
|
375 |
+
- 3
|
376 |
+
- 5
|
377 |
+
- - 1
|
378 |
+
- 3
|
379 |
+
- 5
|
380 |
+
- - 1
|
381 |
+
- 3
|
382 |
+
- 5
|
383 |
+
use_weight_norm_in_decoder: true
|
384 |
+
posterior_encoder_kernel_size: 5
|
385 |
+
posterior_encoder_layers: 16
|
386 |
+
posterior_encoder_stacks: 1
|
387 |
+
posterior_encoder_base_dilation: 1
|
388 |
+
posterior_encoder_dropout_rate: 0.0
|
389 |
+
use_weight_norm_in_posterior_encoder: true
|
390 |
+
flow_flows: 4
|
391 |
+
flow_kernel_size: 5
|
392 |
+
flow_base_dilation: 1
|
393 |
+
flow_layers: 4
|
394 |
+
flow_dropout_rate: 0.0
|
395 |
+
use_weight_norm_in_flow: true
|
396 |
+
use_only_mean_in_flow: true
|
397 |
+
stochastic_duration_predictor_kernel_size: 3
|
398 |
+
stochastic_duration_predictor_dropout_rate: 0.5
|
399 |
+
stochastic_duration_predictor_flows: 4
|
400 |
+
stochastic_duration_predictor_dds_conv_layers: 3
|
401 |
+
vocabs: 180
|
402 |
+
aux_channels: 513
|
403 |
+
discriminator_type: hifigan_multi_scale_multi_period_discriminator
|
404 |
+
discriminator_params:
|
405 |
+
scales: 1
|
406 |
+
scale_downsample_pooling: AvgPool1d
|
407 |
+
scale_downsample_pooling_params:
|
408 |
+
kernel_size: 4
|
409 |
+
stride: 2
|
410 |
+
padding: 2
|
411 |
+
scale_discriminator_params:
|
412 |
+
in_channels: 1
|
413 |
+
out_channels: 1
|
414 |
+
kernel_sizes:
|
415 |
+
- 15
|
416 |
+
- 41
|
417 |
+
- 5
|
418 |
+
- 3
|
419 |
+
channels: 128
|
420 |
+
max_downsample_channels: 1024
|
421 |
+
max_groups: 16
|
422 |
+
bias: true
|
423 |
+
downsample_scales:
|
424 |
+
- 2
|
425 |
+
- 2
|
426 |
+
- 4
|
427 |
+
- 4
|
428 |
+
- 1
|
429 |
+
nonlinear_activation: LeakyReLU
|
430 |
+
nonlinear_activation_params:
|
431 |
+
negative_slope: 0.1
|
432 |
+
use_weight_norm: true
|
433 |
+
use_spectral_norm: false
|
434 |
+
follow_official_norm: false
|
435 |
+
periods:
|
436 |
+
- 2
|
437 |
+
- 3
|
438 |
+
- 5
|
439 |
+
- 7
|
440 |
+
- 11
|
441 |
+
period_discriminator_params:
|
442 |
+
in_channels: 1
|
443 |
+
out_channels: 1
|
444 |
+
kernel_sizes:
|
445 |
+
- 5
|
446 |
+
- 3
|
447 |
+
channels: 32
|
448 |
+
downsample_scales:
|
449 |
+
- 3
|
450 |
+
- 3
|
451 |
+
- 3
|
452 |
+
- 3
|
453 |
+
- 1
|
454 |
+
max_downsample_channels: 1024
|
455 |
+
bias: true
|
456 |
+
nonlinear_activation: LeakyReLU
|
457 |
+
nonlinear_activation_params:
|
458 |
+
negative_slope: 0.1
|
459 |
+
use_weight_norm: true
|
460 |
+
use_spectral_norm: false
|
461 |
+
generator_adv_loss_params:
|
462 |
+
average_by_discriminators: false
|
463 |
+
loss_type: mse
|
464 |
+
discriminator_adv_loss_params:
|
465 |
+
average_by_discriminators: false
|
466 |
+
loss_type: mse
|
467 |
+
feat_match_loss_params:
|
468 |
+
average_by_discriminators: false
|
469 |
+
average_by_layers: false
|
470 |
+
include_final_outputs: true
|
471 |
+
mel_loss_params:
|
472 |
+
fs: 22050
|
473 |
+
n_fft: 1024
|
474 |
+
hop_length: 256
|
475 |
+
win_length: null
|
476 |
+
window: hann
|
477 |
+
n_mels: 80
|
478 |
+
fmin: 0
|
479 |
+
fmax: null
|
480 |
+
log_base: null
|
481 |
+
lambda_adv: 1.0
|
482 |
+
lambda_mel: 45.0
|
483 |
+
lambda_feat_match: 2.0
|
484 |
+
lambda_dur: 1.0
|
485 |
+
lambda_kl: 1.0
|
486 |
+
sampling_rate: 22050
|
487 |
+
cache_generator_outputs: true
|
488 |
+
pitch_extract: null
|
489 |
+
pitch_extract_conf: {}
|
490 |
+
pitch_normalize: null
|
491 |
+
pitch_normalize_conf: {}
|
492 |
+
energy_extract: null
|
493 |
+
energy_extract_conf: {}
|
494 |
+
energy_normalize: null
|
495 |
+
energy_normalize_conf: {}
|
496 |
+
required:
|
497 |
+
- output_dir
|
498 |
+
- token_list
|
499 |
+
version: '202301'
|
500 |
+
distributed: false
|
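Several values in the config.yaml diff above are tied together arithmetically. The following is a minimal sketch of sanity checks using only numbers copied from that diff. The variable names are illustrative, and the interpretation is an assumption based on how VITS-style models are usually wired, not something this commit states: that `aux_channels` is the number of linear-spectrogram bins, that the decoder's total upsampling factor should match `hop_length`, and that `segment_size` is counted in frames.

```python
from math import prod

# Values copied from the config.yaml diff above.
n_fft = 1024
hop_length = 256
aux_channels = 513
decoder_upsample_scales = [8, 8, 2, 2]
segment_size = 32          # assumed to be measured in spectrogram frames
sampling_rate = 22050
vocabs = 180
# token_list spans config lines 138 (<blank>) through 317 (<sos/eos>).
token_list_first_line, token_list_last_line = 138, 317

# A one-sided linear spectrogram has n_fft // 2 + 1 frequency bins.
n_bins = n_fft // 2 + 1

# Total upsampling factor of the waveform decoder (frames -> samples).
total_upsampling = prod(decoder_upsample_scales)

# Number of token_list entries, derived from the diff line numbers.
n_tokens = token_list_last_line - token_list_first_line + 1

# Length of one training segment in samples and in seconds.
segment_samples = segment_size * hop_length
segment_seconds = segment_samples / sampling_rate

print("spectrogram bins:", n_bins, "| aux_channels:", aux_channels)
print("decoder upsampling:", total_upsampling, "| hop_length:", hop_length)
print("token_list entries:", n_tokens, "| vocabs:", vocabs)
print("segment:", segment_samples, "samples,", round(segment_seconds, 3), "s")
```

If any of the three pairs disagreed after editing the config (for example, changing `hop_length` without adjusting `decoder_upsample_scales`), the generated waveform length would likely no longer line up with the spectrogram frames, so checks like these are cheap to run before retraining.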
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_backward_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_fake_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_forward_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_optim_step_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_real_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/discriminator_train_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_adv_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_backward_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_dur_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_feat_match_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_forward_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_kl_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_mel_loss.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_optim_step_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/generator_train_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/gpu_max_cached_mem_GB.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/iter_time.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/optim0_lr0.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/optim1_lr0.png
ADDED
exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/images/train_time.png
ADDED
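The `optim0_lr0.png` and `optim1_lr0.png` files listed above presumably plot the learning rates of the two optimizers. In config.yaml both optimizers use `lr: 0.0002` with an `exponentiallr` scheduler and `gamma: 0.999875`. The sketch below shows the decay this implies under two assumptions that the commit does not state: that the scheduler steps once per epoch, and that training ran for 600 epochs (inferred only from the checkpoint name `600epoch.pth`). The function name `exponential_lr` is illustrative.

```python
def exponential_lr(base_lr: float, gamma: float, steps: int) -> float:
    """Learning rate after `steps` exponential-decay steps: base_lr * gamma**steps."""
    return base_lr * gamma ** steps


# Values copied from optim_conf / scheduler_conf in config.yaml.
base_lr = 0.0002
gamma = 0.999875

# Assumption: one scheduler step per epoch, 600 epochs in total.
for epoch in (0, 100, 300, 600):
    lr = exponential_lr(base_lr, gamma, epoch)
    print(f"epoch {epoch:3d}: lr = {lr:.6e}")
```

Under these assumptions the learning rate falls by only a few percent over the whole run, so the plotted curves would be expected to look nearly flat. If the scheduler actually steps per iteration, the decay would be much steeper.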
meta.yaml
ADDED
@@ -0,0 +1,8 @@
1 |
+
espnet: '202301'
|
2 |
+
files:
|
3 |
+
model_file: exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/600epoch.pth
|
4 |
+
python: "3.9.16 (main, Mar 8 2023, 14:00:05) \n[GCC 11.2.0]"
|
5 |
+
timestamp: 1682826213.359642
|
6 |
+
torch: 1.12.1
|
7 |
+
yaml_files:
|
8 |
+
train_config: exp/22k/tts_train_raw_phn_pypinyin_g2p_phone/config.yaml
|
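The `timestamp` field in meta.yaml looks like a Unix epoch value in seconds. As a minimal sketch under that assumption, it can be converted to a readable UTC date with the Python standard library; the variable name `packed_at` is illustrative.

```python
from datetime import datetime, timezone

# Value copied from meta.yaml; assumed to be seconds since the Unix epoch.
timestamp = 1682826213.359642

# Interpret the value in UTC so the result does not depend on the local time zone.
packed_at = datetime.fromtimestamp(timestamp, tz=timezone.utc)

print(packed_at.isoformat())
```

The resulting date falls on 30 April 2023, which is consistent with the `espnet: '202301'` version and the Python 3.9.16 build date recorded in the same file.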