JunzheJosephZhu committed
Commit 99e9b7b
1 Parent(s): 7093ec8
Files changed (1)
  1. README.md +65 -0
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ tags:
+ - asteroid
+ - audio
+ - MultiDecoderDPRNN
+ datasets:
+ - Wsj0MixVar
+ - sep_clean
+ inference: false
+ ---
+ ## Asteroid model
+
+ ## Description:
+ Refer to paper "Multi-Decoder DPRNN: High Accuracy Source Counting and Separation",
+ Junzhe Zhu, Raymond Yeh, Mark Hasegawa-Johnson. https://arxiv.org/abs/2011.12022
+ Demo Page: https://junzhejosephzhu.github.io/Multi-Decoder-DPRNN/
+ Original research repo is at https://github.com/JunzheJosephZhu/MultiDecoder-DPRNN
+
+ This model was trained by Joseph Zhu using the wsj0-mix-var/Multi-Decoder-DPRNN recipe in Asteroid.
+ It was trained on the `sep_clean` task of the Wsj0MixVar dataset.
+
+ ## Training config:
+ ```yaml
+ filterbank:
+   n_filters: 64
+   kernel_size: 8
+   stride: 4
+ masknet:
+   n_srcs: [2, 3, 4, 5]
+   bn_chan: 128
+   hid_size: 128
+   chunk_size: 128
+   hop_size: 64
+   n_repeats: 8
+   mask_act: 'sigmoid'
+   bidirectional: true
+   dropout: 0
+   use_mulcat: false
+ training:
+   epochs: 200
+   batch_size: 2
+   num_workers: 2
+   half_lr: yes
+   lr_decay: yes
+   early_stop: yes
+   gradient_clipping: 5
+ optim:
+   optimizer: adam
+   lr: 0.001
+   weight_decay: 0.00000
+ data:
+   train_dir: "data/{}speakers/wav8k/min/tr"
+   valid_dir: "data/{}speakers/wav8k/min/cv"
+   task: sep_clean
+   sample_rate: 8000
+   seglen: 4.0
+   minlen: 2.0
+ loss:
+   lambda: 0.05
+ ```
+
+ ## Results:
+ ```yaml
+ tmux attach -t 2
+ ```
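
The training config in the README above implies a few derived quantities that are easy to sanity-check. The following is a minimal Python sketch, not code from the Asteroid recipe. It assumes that `seglen` and `minlen` are durations in seconds, that the `{}` in `train_dir` is a placeholder filled with each value of `n_srcs`, and that the encoder behaves like an unpadded strided 1-D convolution. The helper names (`seconds_to_samples`, `expand_dirs`, `encoder_frames`) are hypothetical and exist only for this illustration.

```python
# Values copied from the training config in the README diff above.
config = {
    "filterbank": {"n_filters": 64, "kernel_size": 8, "stride": 4},
    "masknet": {"n_srcs": [2, 3, 4, 5]},
    "data": {
        "train_dir": "data/{}speakers/wav8k/min/tr",
        "valid_dir": "data/{}speakers/wav8k/min/cv",
        "sample_rate": 8000,
        "seglen": 4.0,
        "minlen": 2.0,
    },
}


def seconds_to_samples(seconds, sample_rate):
    """Convert a duration to a whole number of samples.

    Assumption: seglen/minlen are given in seconds.
    """
    return int(round(seconds * sample_rate))


def expand_dirs(template, n_srcs):
    """Fill the `{}` placeholder with each speaker count.

    Assumption: this is how the recipe resolves the directory template.
    """
    return [template.format(n) for n in n_srcs]


def encoder_frames(n_samples, kernel_size, stride):
    """Frame count of an unpadded strided 1-D convolution.

    Assumption: the real encoder may pad its input and yield a different count.
    """
    return (n_samples - kernel_size) // stride + 1


data = config["data"]
fb = config["filterbank"]

seg_samples = seconds_to_samples(data["seglen"], data["sample_rate"])
min_samples = seconds_to_samples(data["minlen"], data["sample_rate"])
train_dirs = expand_dirs(data["train_dir"], config["masknet"]["n_srcs"])
n_frames = encoder_frames(seg_samples, fb["kernel_size"], fb["stride"])

print(seg_samples, min_samples, n_frames)
print(train_dirs)
```

Under these assumptions, a training segment is 32000 samples long, the minimum utterance length is 16000 samples, and one directory is expected per speaker count from 2 to 5.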