Commit 9c08333 by jungjee
Parent: 3b54db8

Update model

Files changed (23)
  1. README.md +291 -0
  2. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/16epoch.pth +3 -0
  3. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/RESULTS.md +17 -0
  4. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/config.yaml +200 -0
  5. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/backward_time.png +0 -0
  6. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/clip.png +0 -0
  7. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/eer.png +0 -0
  8. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/forward_time.png +0 -0
  9. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/gpu_max_cached_mem_GB.png +0 -0
  10. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/grad_norm.png +0 -0
  11. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/iter_time.png +0 -0
  12. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/loss.png +0 -0
  13. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/loss_scale.png +0 -0
  14. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/mindcf.png +0 -0
  15. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/n_trials.png +0 -0
  16. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/nontrg_mean.png +0 -0
  17. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/nontrg_std.png +0 -0
  18. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/optim0_lr0.png +0 -0
  19. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/optim_step_time.png +0 -0
  20. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/train_time.png +0 -0
  21. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/trg_mean.png +0 -0
  22. exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/trg_std.png +0 -0
  23. meta.yaml +8 -0
README.md ADDED
@@ -0,0 +1,291 @@
+ ---
+ tags:
+ - espnet
+ - audio
+ - speaker-recognition
+ language: multilingual
+ datasets:
+ - voxceleb
+ license: cc-by-4.0
+ ---
+
+ ## ESPnet2 SPK model
+
+ ### `espnet/voxcelebs12_ska_wavlm_joint`
+
+ This model was trained by Jungjee using the voxceleb recipe in [espnet](https://github.com/espnet/espnet/).
+
+ ### Demo: How to use in ESPnet2
+
+ Follow the [ESPnet installation instructions](https://espnet.github.io/espnet/installation.html)
+ if you haven't done that already.
+
+ ```bash
+ cd espnet
+ git checkout 7f74ef2807eb9d800491c805f455d5c62a195c53
+ pip install -e .
+ cd egs2/voxceleb/spk1
+ ./run.sh --skip_data_prep false --skip_train true --download_model espnet/voxcelebs12_ska_wavlm_joint
+ ```
+
+ <!-- Generated by scripts/utils/show_spk_result.py -->
+ # RESULTS
+ ## Environments
+ date: 2024-01-08 21:12:03.149830
+
+ - python version: 3.9.16 (main, Mar 8 2023, 14:00:05) [GCC 11.2.0]
+ - espnet version: 202310
+ - pytorch version: 1.13.1
+
+ | | Mean | Std |
+ |---|---|---|
+ | Target | -0.7420 | 0.1281 |
+ | Non-target | 0.0822 | 0.0822 |
+
+ | Model name | EER(%) | minDCF |
+ |---|---|---|
+ | conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt | 0.516 | 0.05583 |
+
+ ## SPK config
+
+ <details><summary>expand</summary>
+
+ ```
+ config: conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: true
+ dry_run: false
+ iterator_type: category
+ valid_iterator_type: sequence
+ output_dir: exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp
+ ngpu: 1
+ seed: 0
+ num_workers: 8
+ num_att_plot: 0
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 52613
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: true
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: true
+ cudnn_deterministic: false
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 20
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - eer
+     - min
+ keep_nbest_models: 2
+ nbest_averaging_interval: 0
+ grad_clip: 9999
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 64
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: true
+ log_interval: 100
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param:
+ - exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_nodownsample_raw_sp/valid.eer.best.pth
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: 32000
+ batch_size: 16
+ valid_batch_size: 5
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/spk_stats_16k_sp/train/speech_shape
+ valid_shape_file:
+ - exp/spk_stats_16k_sp/valid/speech_shape
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 120000
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb12_devs_sp/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb12_devs_sp/utt2spk
+     - spk_labels
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb1_test/trial.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb1_test/trial2.scp
+     - speech2
+     - sound
+ -   - dump/raw/voxceleb1_test/trial_label
+     - spk_labels
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+     lr: 0.0001
+     weight_decay: 1.0e-05
+     amsgrad: false
+ scheduler: cosineannealingwarmuprestarts
+ scheduler_conf:
+     first_cycle_steps: 10000
+     cycle_mult: 1.0
+     max_lr: 5.0e-05
+     min_lr: 5.0e-06
+     warmup_steps: 1000
+     gamma: 0.75
+ init: null
+ use_preprocessor: true
+ input_size: null
+ target_duration: 3.0
+ spk2utt: dump/raw/voxceleb12_devs_sp/spk2utt
+ spk_num: 21615
+ sample_rate: 16000
+ num_eval: 10
+ rir_scp: ''
+ model_conf:
+     extract_feats_in_collect_stats: false
+ frontend: s3prl
+ frontend_conf:
+     frontend_conf:
+         upstream: wavlm_large
+     download_dir: ./hub
+     multilayer_feature: true
+ specaug: null
+ specaug_conf: {}
+ normalize: utterance_mvn
+ normalize_conf:
+     norm_vars: false
+ encoder: ska_tdnn
+ encoder_conf:
+     model_scale: 8
+     ndim: 1024
+     ska_dim: 128
+     output_size: 1536
+ pooling: chn_attn_stat
+ pooling_conf: {}
+ projector: ska_tdnn
+ projector_conf:
+     output_size: 192
+ preprocessor: spk
+ preprocessor_conf:
+     target_duration: 6.0
+     sample_rate: 16000
+     num_eval: 3
+     noise_apply_prob: 0.0
+     noise_info:
+     -   - 1.0
+         - dump/raw/musan_speech.scp
+         -   - 4
+             - 7
+         -   - 13
+             - 20
+     -   - 1.0
+         - dump/raw/musan_noise.scp
+         -   - 1
+             - 1
+         -   - 0
+             - 15
+     -   - 1.0
+         - dump/raw/musan_music.scp
+         -   - 1
+             - 1
+         -   - 5
+             - 15
+     rir_apply_prob: 0.0
+     rir_scp: dump/raw/rirs.scp
+ loss: aamsoftmax_sc_topk
+ loss_conf:
+     margin: 0.5
+     scale: 30
+     K: 3
+     mp: 0.06
+     k_top: 5
+ required:
+ - output_dir
+ version: '202310'
+ distributed: true
+ ```
+
+ </details>
+
+
+
+ ### Citing ESPnet
+
+ ```BibTex
+ @inproceedings{watanabe2018espnet,
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   title={{ESPnet}: End-to-End Speech Processing Toolkit},
+   year={2018},
+   booktitle={Proceedings of Interspeech},
+   pages={2207--2211},
+   doi={10.21437/Interspeech.2018-1456},
+   url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
+ }
+
+
+
+
+
+
+ ```
+
+ or arXiv:
+
+ ```bibtex
+ @misc{watanabe2018espnet,
+   title={ESPnet: End-to-End Speech Processing Toolkit},
+   author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
+   year={2018},
+   eprint={1804.00015},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/16epoch.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:635042aa11a51934178a671607a20e33d9f6086bd7467fc752ed2ed94184e794
+ size 2074845418
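
Reviewer note: the three lines above are a Git LFS pointer, not the checkpoint itself. As a rough sketch of how a downloaded `16epoch.pth` could be checked against that pointer, the snippet below parses the pointer text and compares file size and SHA-256 digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of ESPnet or Git LFS tooling.

```python
import hashlib
import os

# Pointer text as shown in the diff above (without the leading "+" markers).
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:635042aa11a51934178a671607a20e33d9f6086bd7467fc752ed2ed94184e794
size 2074845418
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its hash algorithm, hex digest and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated download
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("path/to/16epoch.pth", pointer)` on a local copy would then report whether the download is complete; the path is a placeholder.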
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/RESULTS.md ADDED
@@ -0,0 +1,17 @@
+ <!-- Generated by scripts/utils/show_spk_result.py -->
+ # RESULTS
+ ## Environments
+ date: 2024-01-08 21:12:03.149830
+
+ - python version: 3.9.16 (main, Mar 8 2023, 14:00:05) [GCC 11.2.0]
+ - espnet version: 202310
+ - pytorch version: 1.13.1
+
+ | | Mean | Std |
+ |---|---|---|
+ | Target | -0.7420 | 0.1281 |
+ | Non-target | 0.0822 | 0.0822 |
+
+ | Model name | EER(%) | minDCF |
+ |---|---|---|
+ | conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt | 0.516 | 0.05583 |
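
Reviewer note: for readers unfamiliar with the EER column, the equal error rate is the operating point where the false-acceptance rate on non-target trials equals the false-rejection rate on target trials. The sketch below is a generic threshold sweep in plain Python, not the code ESPnet uses to produce this table. It assumes that a higher score means "more likely the same speaker"; if a scoring backend uses the opposite polarity, the scores would need to be negated first. The function name `compute_eer` and the toy scores are illustrative only.

```python
def compute_eer(target_scores, nontarget_scores):
    """Return (eer, threshold) from a simple sweep over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best = None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a target (same-speaker) trial scored below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above the threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR where they are closest.
            best = (gap, (far + frr) / 2.0, thr)
    return best[1], best[2]


# Toy scores, invented for illustration; they are unrelated to the table above.
toy_target = [0.9, 0.8, 0.7, 0.3]
toy_nontarget = [0.6, 0.2, 0.1, 0.0]
eer, threshold = compute_eer(toy_target, toy_nontarget)
print(f"EER = {100 * eer:.1f}% at threshold {threshold}")
```

The sweep is quadratic in the number of trials, which is fine for a handful of scores but too slow for a full trial list; a sorted cumulative count would be the usual replacement.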
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/config.yaml ADDED
@@ -0,0 +1,200 @@
+ config: conf/tuning/train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt.yaml
+ print_config: false
+ log_level: INFO
+ drop_last_iter: true
+ dry_run: false
+ iterator_type: category
+ valid_iterator_type: sequence
+ output_dir: exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp
+ ngpu: 1
+ seed: 0
+ num_workers: 8
+ num_att_plot: 0
+ dist_backend: nccl
+ dist_init_method: env://
+ dist_world_size: 4
+ dist_rank: 0
+ local_rank: 0
+ dist_master_addr: localhost
+ dist_master_port: 52613
+ dist_launcher: null
+ multiprocessing_distributed: true
+ unused_parameters: true
+ sharded_ddp: false
+ cudnn_enabled: true
+ cudnn_benchmark: true
+ cudnn_deterministic: false
+ collect_stats: false
+ write_collected_feats: false
+ max_epoch: 20
+ patience: null
+ val_scheduler_criterion:
+ - valid
+ - loss
+ early_stopping_criterion:
+ - valid
+ - loss
+ - min
+ best_model_criterion:
+ -   - valid
+     - eer
+     - min
+ keep_nbest_models: 2
+ nbest_averaging_interval: 0
+ grad_clip: 9999
+ grad_clip_type: 2.0
+ grad_noise: false
+ accum_grad: 64
+ no_forward_run: false
+ resume: true
+ train_dtype: float32
+ use_amp: true
+ log_interval: 100
+ use_matplotlib: true
+ use_tensorboard: true
+ create_graph_in_tensorboard: false
+ use_wandb: false
+ wandb_project: null
+ wandb_id: null
+ wandb_entity: null
+ wandb_name: null
+ wandb_model_log_interval: -1
+ detect_anomaly: false
+ use_lora: false
+ save_lora_only: true
+ lora_conf: {}
+ pretrain_path: null
+ init_param:
+ - exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_nodownsample_raw_sp/valid.eer.best.pth
+ ignore_init_mismatch: false
+ freeze_param: []
+ num_iters_per_epoch: 32000
+ batch_size: 16
+ valid_batch_size: 5
+ batch_bins: 1000000
+ valid_batch_bins: null
+ train_shape_file:
+ - exp/spk_stats_16k_sp/train/speech_shape
+ valid_shape_file:
+ - exp/spk_stats_16k_sp/valid/speech_shape
+ batch_type: folded
+ valid_batch_type: null
+ fold_length:
+ - 120000
+ sort_in_batch: descending
+ shuffle_within_batch: false
+ sort_batch: descending
+ multiple_iterator: false
+ chunk_length: 500
+ chunk_shift_ratio: 0.5
+ num_cache_chunks: 1024
+ chunk_excluded_key_prefixes: []
+ chunk_default_fs: null
+ train_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb12_devs_sp/wav.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb12_devs_sp/utt2spk
+     - spk_labels
+     - text
+ valid_data_path_and_name_and_type:
+ -   - dump/raw/voxceleb1_test/trial.scp
+     - speech
+     - sound
+ -   - dump/raw/voxceleb1_test/trial2.scp
+     - speech2
+     - sound
+ -   - dump/raw/voxceleb1_test/trial_label
+     - spk_labels
+     - text
+ allow_variable_data_keys: false
+ max_cache_size: 0.0
+ max_cache_fd: 32
+ allow_multi_rates: false
+ valid_max_cache_size: null
+ exclude_weight_decay: false
+ exclude_weight_decay_conf: {}
+ optim: adam
+ optim_conf:
+     lr: 0.0001
+     weight_decay: 1.0e-05
+     amsgrad: false
+ scheduler: cosineannealingwarmuprestarts
+ scheduler_conf:
+     first_cycle_steps: 10000
+     cycle_mult: 1.0
+     max_lr: 5.0e-05
+     min_lr: 5.0e-06
+     warmup_steps: 1000
+     gamma: 0.75
+ init: null
+ use_preprocessor: true
+ input_size: null
+ target_duration: 3.0
+ spk2utt: dump/raw/voxceleb12_devs_sp/spk2utt
+ spk_num: 21615
+ sample_rate: 16000
+ num_eval: 10
+ rir_scp: ''
+ model_conf:
+     extract_feats_in_collect_stats: false
+ frontend: s3prl
+ frontend_conf:
+     frontend_conf:
+         upstream: wavlm_large
+     download_dir: ./hub
+     multilayer_feature: true
+ specaug: null
+ specaug_conf: {}
+ normalize: utterance_mvn
+ normalize_conf:
+     norm_vars: false
+ encoder: ska_tdnn
+ encoder_conf:
+     model_scale: 8
+     ndim: 1024
+     ska_dim: 128
+     output_size: 1536
+ pooling: chn_attn_stat
+ pooling_conf: {}
+ projector: ska_tdnn
+ projector_conf:
+     output_size: 192
+ preprocessor: spk
+ preprocessor_conf:
+     target_duration: 6.0
+     sample_rate: 16000
+     num_eval: 3
+     noise_apply_prob: 0.0
+     noise_info:
+     -   - 1.0
+         - dump/raw/musan_speech.scp
+         -   - 4
+             - 7
+         -   - 13
+             - 20
+     -   - 1.0
+         - dump/raw/musan_noise.scp
+         -   - 1
+             - 1
+         -   - 0
+             - 15
+     -   - 1.0
+         - dump/raw/musan_music.scp
+         -   - 1
+             - 1
+         -   - 5
+             - 15
+     rir_apply_prob: 0.0
+     rir_scp: dump/raw/rirs.scp
+ loss: aamsoftmax_sc_topk
+ loss_conf:
+     margin: 0.5
+     scale: 30
+     K: 3
+     mp: 0.06
+     k_top: 5
+ required:
+ - output_dir
+ version: '202310'
+ distributed: true
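
Reviewer note: the `scheduler_conf` values in this file describe a cosine schedule with linear warmup and decaying restarts. The sketch below shows what those numbers would imply under the commonly used formulation of that schedule: linear warmup from `min_lr` to the cycle peak, cosine decay back to `min_lr`, and the peak multiplied by `gamma` at every restart. It is not copied from ESPnet, it only handles `cycle_mult == 1.0` as configured here, and it ignores `optim_conf.lr`, which is an assumption about how the two settings interact. The function name `lr_at_step` is hypothetical.

```python
import math


def lr_at_step(
    step,
    first_cycle_steps=10000,
    cycle_mult=1.0,
    max_lr=5.0e-05,
    min_lr=5.0e-06,
    warmup_steps=1000,
    gamma=0.75,
):
    """Learning rate at a given scheduler step (defaults taken from config.yaml)."""
    if cycle_mult != 1.0:
        # Growing cycle lengths need extra bookkeeping that this sketch omits.
        raise NotImplementedError("sketch only covers cycle_mult == 1.0")

    cycle = step // first_cycle_steps          # index of the current restart
    step_in_cycle = step % first_cycle_steps   # position inside that cycle
    peak = max_lr * gamma ** cycle             # peak shrinks after each restart

    if step_in_cycle < warmup_steps:
        # Linear warmup from min_lr up to the current peak.
        return min_lr + (peak - min_lr) * step_in_cycle / warmup_steps

    # Cosine decay from the peak back down to min_lr.
    progress = (step_in_cycle - warmup_steps) / (first_cycle_steps - warmup_steps)
    return min_lr + (peak - min_lr) * (1.0 + math.cos(math.pi * progress)) / 2.0


for s in (0, 500, 1000, 5500, 10000, 11000):
    print(s, f"{lr_at_step(s):.3e}")
```

Under these assumptions the rate starts each cycle at `min_lr`, reaches the peak after 1000 steps, and the second cycle peaks at 0.75 times the first.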
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/backward_time.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/clip.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/eer.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/forward_time.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/gpu_max_cached_mem_GB.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/grad_norm.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/iter_time.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/loss.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/loss_scale.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/mindcf.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/n_trials.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/nontrg_mean.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/nontrg_std.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/optim0_lr0.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/optim_step_time.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/train_time.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/trg_mean.png ADDED
exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/images/trg_std.png ADDED
meta.yaml ADDED
@@ -0,0 +1,8 @@
+ espnet: '202310'
+ files:
+     model_file: exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/16epoch.pth
+ python: "3.9.16 (main, Mar 8 2023, 14:00:05) \n[GCC 11.2.0]"
+ timestamp: 1705537531.895315
+ torch: 1.13.1
+ yaml_files:
+     train_config: exp/spk_train_ska_Vox12_emb192_torchmelspec_subcentertopk_wavlm_jt_raw_sp/config.yaml
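
Reviewer note: the `timestamp` field in `meta.yaml` looks like Unix epoch seconds, although the file itself does not say so. Under that assumption, a small sketch for turning it into a readable UTC time is shown below; the helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone

# Value of `timestamp` in meta.yaml; assumed to be seconds since the Unix epoch.
META_TIMESTAMP = 1705537531.895315


def to_utc(epoch_seconds):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


print(to_utc(META_TIMESTAMP).isoformat())
```

This makes it easy to compare the packaging time with the `date` recorded in `RESULTS.md`.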