Siddhant committed
Commit 1015d84
1 Parent(s): 1076861

import from zenodo

Files changed (27)
  1. README.md +50 -0
  2. dump/raw/org/tr_no_dev/spk2sid +109 -0
  3. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/config.yaml +390 -0
  4. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_backward_time.png +0 -0
  5. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_fake_loss.png +0 -0
  6. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_forward_time.png +0 -0
  7. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_loss.png +0 -0
  8. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_optim_step_time.png +0 -0
  9. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_real_loss.png +0 -0
  10. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_train_time.png +0 -0
  11. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_adv_loss.png +0 -0
  12. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_backward_time.png +0 -0
  13. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_dur_loss.png +0 -0
  14. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_feat_match_loss.png +0 -0
  15. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_forward_time.png +0 -0
  16. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_kl_loss.png +0 -0
  17. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_loss.png +0 -0
  18. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_mel_loss.png +0 -0
  19. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_optim_step_time.png +0 -0
  20. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_train_time.png +0 -0
  21. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/gpu_max_cached_mem_GB.png +0 -0
  22. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/iter_time.png +0 -0
  23. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/optim0_lr0.png +0 -0
  24. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/optim1_lr0.png +0 -0
  25. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/train_time.png +0 -0
  26. exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/train.total_count.ave_10best.pth +3 -0
  27. meta.yaml +8 -0
README.md ADDED
@@ -0,0 +1,50 @@
+ ---
+ tags:
+ - espnet
+ - audio
+ - text-to-speech
+ language: en
+ datasets:
+ - vctk
+ license: cc-by-4.0
+ ---
+ ## ESPnet2 TTS pretrained model
+ ### `kan-bayashi/vctk_tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space_train.total_count.ave`
+ ♻️ Imported from https://zenodo.org/record/5500759/
+
+ This model was trained by kan-bayashi using vctk/tts1 recipe in [espnet](https://github.com/espnet/espnet/).
+ ### Demo: How to use in ESPnet2
+ ```python
+ # coming soon
+ ```
+ ### Citing ESPnet
+ ```BibTex
+ @inproceedings{watanabe2018espnet,
+ author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson {Enrique Yalta Soplin} and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+ title={{ESPnet}: End-to-End Speech Processing Toolkit},
+ year={2018},
+ booktitle={Proceedings of Interspeech},
+ pages={2207--2211},
+ doi={10.21437/Interspeech.2018-1456},
+ url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+ @inproceedings{hayashi2020espnet,
+ title={{Espnet-TTS}: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit},
+ author={Hayashi, Tomoki and Yamamoto, Ryuichi and Inoue, Katsuki and Yoshimura, Takenori and Watanabe, Shinji and Toda, Tomoki and Takeda, Kazuya and Zhang, Yu and Tan, Xu},
+ booktitle={Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+ pages={7654--7658},
+ year={2020},
+ organization={IEEE}
+ }
+ ```
+ or arXiv:
+ ```bibtex
+ @misc{watanabe2018espnet,
+ title={ESPnet: End-to-End Speech Processing Toolkit},
+ author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Enrique Yalta Soplin and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+ year={2018},
+ eprint={1804.00015},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
dump/raw/org/tr_no_dev/spk2sid ADDED
@@ -0,0 +1,109 @@
+ <unk> 0
+ p225 1
+ p226 2
+ p227 3
+ p228 4
+ p229 5
+ p230 6
+ p231 7
+ p232 8
+ p233 9
+ p234 10
+ p236 11
+ p237 12
+ p238 13
+ p239 14
+ p240 15
+ p241 16
+ p243 17
+ p244 18
+ p245 19
+ p246 20
+ p247 21
+ p248 22
+ p249 23
+ p250 24
+ p251 25
+ p252 26
+ p253 27
+ p254 28
+ p255 29
+ p256 30
+ p257 31
+ p258 32
+ p259 33
+ p260 34
+ p261 35
+ p262 36
+ p263 37
+ p264 38
+ p265 39
+ p266 40
+ p267 41
+ p268 42
+ p269 43
+ p270 44
+ p271 45
+ p272 46
+ p273 47
+ p274 48
+ p275 49
+ p276 50
+ p277 51
+ p278 52
+ p279 53
+ p280 54
+ p281 55
+ p282 56
+ p283 57
+ p284 58
+ p285 59
+ p286 60
+ p287 61
+ p288 62
+ p292 63
+ p293 64
+ p294 65
+ p295 66
+ p297 67
+ p298 68
+ p299 69
+ p300 70
+ p301 71
+ p302 72
+ p303 73
+ p304 74
+ p305 75
+ p306 76
+ p307 77
+ p308 78
+ p310 79
+ p311 80
+ p312 81
+ p313 82
+ p314 83
+ p316 84
+ p317 85
+ p318 86
+ p323 87
+ p326 88
+ p329 89
+ p330 90
+ p333 91
+ p334 92
+ p335 93
+ p336 94
+ p339 95
+ p340 96
+ p341 97
+ p343 98
+ p345 99
+ p347 100
+ p351 101
+ p360 102
+ p361 103
+ p362 104
+ p363 105
+ p364 106
+ p374 107
+ p376 108
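Each line of `spk2sid` pairs a VCTK speaker label with an integer speaker ID, and `<unk>` takes ID 0. The following is a minimal sketch of reading that mapping, assuming whitespace-separated `label id` pairs as shown above; `parse_spk2sid` is a hypothetical helper name and is not part of ESPnet.

```python
from typing import Dict


def parse_spk2sid(text: str) -> Dict[str, int]:
    """Parse 'speaker-label integer-id' lines into a dictionary."""
    mapping: Dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        speaker, sid = line.split()
        mapping[speaker] = int(sid)
    return mapping


# A few lines copied from the file above (not the full list).
sample = "<unk> 0\np225 1\np226 2\np376 108\n"
spk2sid = parse_spk2sid(sample)

# Reverse lookup: integer ID back to speaker label.
sid2spk = {sid: spk for spk, sid in spk2sid.items()}
```

The reverse dictionary is handy when a synthesis call takes an integer speaker ID and the caller wants to report which VCTK speaker it corresponds to.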
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/config.yaml ADDED
@@ -0,0 +1,390 @@
+ config: ./conf/tuning/train_multi_spk_vits.yaml
+ print_config: false
+ log_level: INFO
+ dry_run: false
+ iterator_type: sequence
+ output_dir: exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space
+ ngpu: 1
+ seed: 777
+ num_workers: 4
+ num_att_plot: 3
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 39150
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: true
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: false
+ cudnn_deterministic: false
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 2000
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - train
+     - total_count
+     - max
+ keep_nbest_models: 10
+ grad_clip: -1
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 1
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: false
+ log_interval: 50
+ use_tensorboard: true
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ pretrain_path: null
+ init_param: []
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: 500
+ batch_size: 20
+ valid_batch_size: null
+ batch_bins: 3000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/train/text_shape.phn
+ - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/train/speech_shape
+ valid_shape_file:
+ - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/valid/text_shape.phn
+ - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/valid/speech_shape
+ batch_type: numel
+ valid_batch_type: null
+ fold_length:
+ - 150
+ - 204800
+ sort_in_batch: descending
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ train_data_path_and_name_and_type:
+ -   - dump/raw/tr_no_dev/text
+     - text
+     - text
+ -   - dump/raw/tr_no_dev/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/tr_no_dev/utt2sid
+     - sids
+     - text_int
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/dev/text
+     - text
+     - text
+ -   - dump/raw/dev/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/dev/utt2sid
+     - sids
+     - text_int
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ valid_max_cache_size: null
+ optim: adamw
+ optim_conf:
+     lr: 0.0002
+     betas:
+     - 0.8
+     - 0.99
+     eps: 1.0e-09
+     weight_decay: 0.0
+ scheduler: exponentiallr
+ scheduler_conf:
+     gamma: 0.999875
+ optim2: adamw
+ optim2_conf:
+     lr: 0.0002
+     betas:
+     - 0.8
+     - 0.99
+     eps: 1.0e-09
+     weight_decay: 0.0
+ scheduler2: exponentiallr
+ scheduler2_conf:
+     gamma: 0.999875
+ generator_first: false
+ token_list:
+ - <blank>
+ - <unk>
+ - AH0
+ - T
+ - N
+ - S
+ - R
+ - IH1
+ - D
+ - L
+ - .
+ - Z
+ - DH
+ - K
+ - W
+ - M
+ - AE1
+ - EH1
+ - AA1
+ - IH0
+ - IY1
+ - AH1
+ - B
+ - P
+ - V
+ - ER0
+ - F
+ - HH
+ - AY1
+ - EY1
+ - UW1
+ - IY0
+ - AO1
+ - OW1
+ - G
+ - ','
+ - NG
+ - SH
+ - Y
+ - JH
+ - AW1
+ - UH1
+ - TH
+ - ER1
+ - CH
+ - '?'
+ - OW0
+ - OW2
+ - EH2
+ - EY2
+ - UW0
+ - IH2
+ - OY1
+ - AY2
+ - ZH
+ - AW2
+ - EH0
+ - IY2
+ - AA2
+ - AE0
+ - AH2
+ - AE2
+ - AO0
+ - AO2
+ - AY0
+ - UW2
+ - UH2
+ - AA0
+ - AW0
+ - EY0
+ - '!'
+ - UH0
+ - ER2
+ - OY2
+ - ''''
+ - OY0
+ - <sos/eos>
+ odim: null
+ model_conf: {}
+ use_preprocessor: true
+ token_type: phn
+ bpemodel: null
+ non_linguistic_symbols: null
+ cleaner: tacotron
+ g2p: g2p_en_no_space
+ feats_extract: linear_spectrogram
+ feats_extract_conf:
+     n_fft: 1024
+     hop_length: 256
+     win_length: null
+ normalize: null
+ normalize_conf: {}
+ tts: vits
+ tts_conf:
+     generator_type: vits_generator
+     generator_params:
+         hidden_channels: 192
+         spks: 128
+         global_channels: 256
+         segment_size: 32
+         text_encoder_attention_heads: 2
+         text_encoder_ffn_expand: 4
+         text_encoder_blocks: 6
+         text_encoder_positionwise_layer_type: conv1d
+         text_encoder_positionwise_conv_kernel_size: 3
+         text_encoder_positional_encoding_layer_type: rel_pos
+         text_encoder_self_attention_layer_type: rel_selfattn
+         text_encoder_activation_type: swish
+         text_encoder_normalize_before: true
+         text_encoder_dropout_rate: 0.1
+         text_encoder_positional_dropout_rate: 0.0
+         text_encoder_attention_dropout_rate: 0.1
+         use_macaron_style_in_text_encoder: true
+         use_conformer_conv_in_text_encoder: false
+         text_encoder_conformer_kernel_size: -1
+         decoder_kernel_size: 7
+         decoder_channels: 512
+         decoder_upsample_scales:
+         - 8
+         - 8
+         - 2
+         - 2
+         decoder_upsample_kernel_sizes:
+         - 16
+         - 16
+         - 4
+         - 4
+         decoder_resblock_kernel_sizes:
+         - 3
+         - 7
+         - 11
+         decoder_resblock_dilations:
+         -   - 1
+             - 3
+             - 5
+         -   - 1
+             - 3
+             - 5
+         -   - 1
+             - 3
+             - 5
+         use_weight_norm_in_decoder: true
+         posterior_encoder_kernel_size: 5
+         posterior_encoder_layers: 16
+         posterior_encoder_stacks: 1
+         posterior_encoder_base_dilation: 1
+         posterior_encoder_dropout_rate: 0.0
+         use_weight_norm_in_posterior_encoder: true
+         flow_flows: 4
+         flow_kernel_size: 5
+         flow_base_dilation: 1
+         flow_layers: 4
+         flow_dropout_rate: 0.0
+         use_weight_norm_in_flow: true
+         use_only_mean_in_flow: true
+         stochastic_duration_predictor_kernel_size: 3
+         stochastic_duration_predictor_dropout_rate: 0.5
+         stochastic_duration_predictor_flows: 4
+         stochastic_duration_predictor_dds_conv_layers: 3
+         vocabs: 77
+         aux_channels: 513
+     discriminator_type: hifigan_multi_scale_multi_period_discriminator
+     discriminator_params:
+         scales: 1
+         scale_downsample_pooling: AvgPool1d
+         scale_downsample_pooling_params:
+             kernel_size: 4
+             stride: 2
+             padding: 2
+         scale_discriminator_params:
+             in_channels: 1
+             out_channels: 1
+             kernel_sizes:
+             - 15
+             - 41
+             - 5
+             - 3
+             channels: 128
+             max_downsample_channels: 1024
+             max_groups: 16
+             bias: true
+             downsample_scales:
+             - 2
+             - 2
+             - 4
+             - 4
+             - 1
+             nonlinear_activation: LeakyReLU
+             nonlinear_activation_params:
+                 negative_slope: 0.1
+             use_weight_norm: true
+             use_spectral_norm: false
+         follow_official_norm: false
+         periods:
+         - 2
+         - 3
+         - 5
+         - 7
+         - 11
+         period_discriminator_params:
+             in_channels: 1
+             out_channels: 1
+             kernel_sizes:
+             - 5
+             - 3
+             channels: 32
+             downsample_scales:
+             - 3
+             - 3
+             - 3
+             - 3
+             - 1
+             max_downsample_channels: 1024
+             bias: true
+             nonlinear_activation: LeakyReLU
+             nonlinear_activation_params:
+                 negative_slope: 0.1
+             use_weight_norm: true
+             use_spectral_norm: false
+     generator_adv_loss_params:
+         average_by_discriminators: false
+         loss_type: mse
+     discriminator_adv_loss_params:
+         average_by_discriminators: false
+         loss_type: mse
+     feat_match_loss_params:
+         average_by_discriminators: false
+         average_by_layers: false
+         include_final_outputs: true
+     mel_loss_params:
+         fs: 22050
+         n_fft: 1024
+         hop_length: 256
+         win_length: null
+         window: hann
+         n_mels: 80
+         fmin: 0
+         fmax: null
+         log_base: null
+     lambda_adv: 1.0
+     lambda_mel: 45.0
+     lambda_feat_match: 2.0
+     lambda_dur: 1.0
+     lambda_kl: 1.0
+     sampling_rate: 22050
+     cache_generator_outputs: true
+ pitch_extract: null
+ pitch_extract_conf: {}
+ pitch_normalize: null
+ pitch_normalize_conf: {}
+ energy_extract: null
+ energy_extract_conf: {}
+ energy_normalize: null
+ energy_normalize_conf: {}
+ required:
+ - output_dir
+ - token_list
+ version: 0.10.3a1
+ distributed: true
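Several values in this config are tied together arithmetically. The sketch below checks three of those relationships and adds an exponential learning-rate decay helper, using numbers copied from the config. Two points are assumptions rather than statements in the source: that `segment_size` is counted in spectrogram frames, and that `gamma` is applied once per scheduler step (how often ESPnet steps this scheduler is not shown here). The variable and function names are illustrative only.

```python
from math import prod

# Values copied from config.yaml above.
n_fft = 1024
hop_length = 256
sampling_rate = 22050
segment_size = 32            # assumed to be counted in frames
decoder_upsample_scales = [8, 8, 2, 2]
aux_channels = 513
base_lr = 0.0002
gamma = 0.999875

# The decoder turns one frame into `upsample_factor` waveform samples,
# so the product of the scales should match the hop length.
upsample_factor = prod(decoder_upsample_scales)

# A linear spectrogram from an n_fft-point FFT has n_fft // 2 + 1 bins.
spectrogram_bins = n_fft // 2 + 1

# Length of one training segment under the frame-count assumption.
segment_samples = segment_size * hop_length
segment_seconds = segment_samples / sampling_rate


def lr_after(steps: int) -> float:
    """Learning rate after `steps` applications of exponential decay."""
    return base_lr * gamma ** steps
```

Under these assumptions a training segment covers 8192 samples, roughly 0.37 s of audio at 22.05 kHz.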
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_backward_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_fake_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_forward_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_optim_step_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_real_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/discriminator_train_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_adv_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_backward_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_dur_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_feat_match_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_forward_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_kl_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_mel_loss.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_optim_step_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/generator_train_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/gpu_max_cached_mem_GB.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/iter_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/optim0_lr0.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/optim1_lr0.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/images/train_time.png ADDED
exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/train.total_count.ave_10best.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1906649e53770718880562149394be11177af79f8fa59b121168155af763af2c
+ size 386076485
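The `.pth` entry in this commit is a Git LFS pointer, not the checkpoint itself: three `key value` lines giving the spec version, the content hash, and the size in bytes. A minimal sketch of reading such a pointer follows, assuming exactly the three keys shown above; `parse_lfs_pointer` is a hypothetical helper, not a Git LFS API.

```python
from typing import Dict, Union


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Split a Git LFS pointer file into its version, hash, and size."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash-algorithm>:<hex-digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:1906649e53770718880562149394be11177af79f8fa59b121168155af763af2c\n"
    "size 386076485\n"
)
pointer = parse_lfs_pointer(pointer_text)
size_mib = pointer["size"] / (1024 * 1024)
```

The size field puts the averaged checkpoint at roughly 368 MiB, which is what an LFS-aware clone would download in place of this pointer.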
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: 0.10.3a2
+ files:
+     model_file: exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/train.total_count.ave_10best.pth
+ python: "3.7.3 (default, Mar 27 2019, 22:11:17) \n[GCC 7.3.0]"
+ timestamp: 1631321259.887765
+ torch: 1.7.1
+ yaml_files:
+     train_config: exp/tts_train_multi_spk_vits_raw_phn_tacotron_g2p_en_no_space/config.yaml
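The `timestamp` field in `meta.yaml` looks like seconds since the Unix epoch. Assuming that interpretation (the file itself does not state the unit), it can be converted to a readable UTC time as sketched below.

```python
from datetime import datetime, timezone

# Value copied from meta.yaml; assumed to be Unix epoch seconds.
timestamp = 1631321259.887765

# Interpret the value in UTC so the result does not depend on the local zone.
packed_at = datetime.fromtimestamp(timestamp, tz=timezone.utc)
packed_at_iso = packed_at.isoformat()
```

Under that assumption the model was packed on 2021-09-11 (UTC).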