---
language:
- en
license: mit
tags:
- embeddings
- Speaker
- Verification
- Identification
- NAS
- TDNN
- pytorch
datasets:
- voxceleb1
- voxceleb2
metrics:
- EER
- minDCF:
- p_target: 0.01
---
# EfficientTDNN
The model versions are listed as follows.
- **Dynamic Kernel**: The model supports kernel sizes in {1, 3, 5}; see `kernel/kernel.torchparams`.
- **Dynamic Depth**: Builds on the **Dynamic Kernel** version and additionally supports depths in {2, 3, 4}; see `depth/depth.torchparams`.
- **Dynamic Width 1**: Builds on the **Dynamic Depth** version and additionally supports width ratios in [0.5, 1.0]; see `width1/width1.torchparams`.
- **Dynamic Width 2**: Builds on the **Dynamic Width 1** version and additionally supports width ratios in [0.25, 0.5]; see `width2/width2.torchparams`.
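
Each version extends the one before it, and this can be summarised in a small lookup table. The sketch below is illustrative only: `VERSIONS` and `enabled_dimensions` are hypothetical names, not part of the released code. The paths and value ranges are copied from the list above. Merging the two width ranges into [0.25, 1.0] for **Dynamic Width 2** is an assumption about how the stages combine.

```python
# Hypothetical lookup table; paths and ranges are copied from the list above.
# Kernel and depth entries are discrete choices; width entries are
# (lower bound, upper bound) of a ratio range.
VERSIONS = [
    # (name, checkpoint path, dimension added at this stage, values)
    ("Dynamic Kernel", "kernel/kernel.torchparams", "kernel", (1, 3, 5)),
    ("Dynamic Depth", "depth/depth.torchparams", "depth", (2, 3, 4)),
    ("Dynamic Width 1", "width1/width1.torchparams", "width", (0.5, 1.0)),
    ("Dynamic Width 2", "width2/width2.torchparams", "width", (0.25, 0.5)),
]


def enabled_dimensions(version_name):
    """Collect every dimension enabled up to and including the given version."""
    enabled = {}
    for name, _path, dimension, values in VERSIONS:
        if dimension == "width" and "width" in enabled:
            # Assumption: a later width stage widens the earlier range
            # rather than replacing it.
            low = min(enabled["width"][0], values[0])
            high = max(enabled["width"][1], values[1])
            enabled["width"] = (low, high)
        else:
            enabled[dimension] = values
        if name == version_name:
            return enabled
    raise KeyError(version_name)


print(enabled_dimensions("Dynamic Depth"))
print(enabled_dimensions("Dynamic Width 2"))
```

Under these assumptions, **Dynamic Depth** reports both kernel and depth choices, while **Dynamic Width 2** reports all three dimensions.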
Furthermore, some subnets are provided in the form of batchnorm weights corresponding to their trained supernets, as follows.
- **Dynamic Kernel**
    1. `kernel/kernel.max.bn.tar`
    2. `kernel/kernel.Kmin.bn.tar`
- **Dynamic Depth**
    1. `depth/depth.max.bn.tar`
    2. `depth/depth.Kmin.bn.tar`
    3. `depth/depth.Dmin.bn.tar`
    4. `depth/depth.3.512.5.5.3.3.1536.bn.tar`
    5. `depth/depth.ecapa-tdnn.3.512.512.512.512.5.3.3.3.1536.bn.tar`
- **Dynamic Width 1**
    1. `width1/width1.torchparams`
    2. `width1/width1.max.bn.tar`
    3. `width1/width1.Kmin.bn.tar`
    4. `width1/width1.Dmin.bn.tar`
    5. `width1/width1.C1min.bn.tar`
    6. `width1/width1.3.383.256.256.256.5.3.3.3.768.bn.tar`
- **Dynamic Width 2**
    1. `width2/width2.max.bn.tar`
    2. `width2/width2.Kmin.bn.tar`
    3. `width2/width2.Dmin.bn.tar`
    4. `width2/width2.C1min.bn.tar`
    5. `width2/width2.C2min.bn.tar`
    6. `width2/width2.3.384.3.1152.bn.tar`
    7. `width2/width2.3.256.256.384.384.1.3.5.3.1152.bn.tar`
    8. `width2/width2.2.256.256.256.3.3.3.400.bn.tar`
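
The subnet files listed above share a visible naming pattern: a directory named after the supernet version, then a file name made of that version, a subnet label, and the `.bn.tar` suffix. The following is a minimal sketch for splitting such a path into its two parts. `split_subnet_path` is a hypothetical helper, not part of the released code, and it does not interpret the numbers inside the label.

```python
from pathlib import PurePosixPath

SUFFIX = ".bn.tar"


def split_subnet_path(path):
    """Split e.g. 'width2/width2.C2min.bn.tar' into ('width2', 'C2min')."""
    p = PurePosixPath(path)
    version = p.parent.name      # directory name, e.g. 'width2'
    filename = p.name            # e.g. 'width2.C2min.bn.tar'
    prefix = version + "."
    if not (filename.startswith(prefix) and filename.endswith(SUFFIX)):
        raise ValueError(f"unexpected subnet file name: {path}")
    # Everything between the version prefix and the suffix is the label.
    label = filename[len(prefix):-len(SUFFIX)]
    return version, label


print(split_subnet_path("width2/width2.C2min.bn.tar"))
print(split_subnet_path("depth/depth.3.512.5.5.3.3.1536.bn.tar"))
```

Paths that do not follow the pattern, such as a `.torchparams` file, raise `ValueError` instead of being silently misparsed.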
The tags are described as follows.
- max: `(4, [512, 512, 512, 512, 512], [5, 5, 5, 5, 5], 1536)`
- Kmin: `(4, [512, 512, 512, 512, 512], [1, 1, 1, 1, 1], 1536)`
- Dmin: `(2, [512, 512, 512], [1, 1, 1], 1536)`
- C1min: `(2, [256, 256, 256], [1, 1, 1], 768)`
- C2min: `(2, [128, 128, 128], [1, 1, 1], 384)`
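
To make the tag tuples easier to compare, the sketch below stores them in a dictionary and derives a few summary values. The tuples are copied from the list above. The field names (depth, per-layer channels, per-layer kernel sizes, final width) are an assumed reading of the tuple layout, and `summarize` is a hypothetical helper. The reference width of 512 is taken from the `max` tag.

```python
# Tag tuples copied from the list above.
# Assumed field order: (depth, channels per layer, kernel size per layer, final width).
TAGS = {
    "max": (4, [512, 512, 512, 512, 512], [5, 5, 5, 5, 5], 1536),
    "Kmin": (4, [512, 512, 512, 512, 512], [1, 1, 1, 1, 1], 1536),
    "Dmin": (2, [512, 512, 512], [1, 1, 1], 1536),
    "C1min": (2, [256, 256, 256], [1, 1, 1], 768),
    "C2min": (2, [128, 128, 128], [1, 1, 1], 384),
}


def summarize(config, max_channels=512):
    """Derive simple summary values from one tag tuple."""
    depth, channels, kernels, last = config
    return {
        "depth": depth,
        "num_layers": len(channels),
        # Smallest channel count relative to the widest ('max') setting.
        "width_ratio": min(channels) / max_channels,
        "kernel_sizes": sorted(set(kernels)),
        "last": last,
    }


for tag, config in TAGS.items():
    print(tag, summarize(config))
```

In every listed tag, the channel and kernel lists have `depth + 1` entries, and `C1min` and `C2min` correspond to width ratios of 0.5 and 0.25 relative to 512.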
More details about EfficientTDNN can be found in the paper [EfficientTDNN](https://arxiv.org/abs/2103.13581).