language:
- en
license: mit
tags:
- embeddings
- Speaker
- Verification
- Identification
- NAS
- TDNN
- pytorch
datasets:
- voxceleb1
- voxceleb2
metrics:
- EER
- minDCF:
- p_target: 0.01
EfficientTDNN
The model versions are listed as follows.
- Dynamic Kernel: The model supports kernel sizes in {1, 3, 5} (kernel/kernel.torchparams).
- Dynamic Depth: Building on the Dynamic Kernel version, the model additionally supports depths in {2, 3, 4} (depth/depth.torchparams).
- Dynamic Width 1: Building on the Dynamic Depth version, the model additionally supports width ratios in [0.5, 1.0] (width1/width1.torchparams).
- Dynamic Width 2: Building on the Dynamic Width 1 version, the model additionally supports width ratios in [0.25, 0.5] (width2/width2.torchparams).
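The progressive widening of the search space across the four versions can be summarized in a short Python sketch. The names `STAGES`, `BASE_WIDTH`, and `width_range` are hypothetical and not part of the released code. The sketch assumes that each version keeps the choices of the previous one, that the Dynamic Kernel version uses a fixed depth of 4, and that the full width is 512 (both taken from the max tag listed later in this card).

```python
# Hypothetical summary of the four supernet versions described above.
# Assumption: each version keeps the choices of the previous one and adds new ones.
BASE_WIDTH = 512  # assumed full width, taken from the "max" tag in this card

STAGES = {
    "kernel": {"kernels": {1, 3, 5}, "depths": {4}, "width_ratio": (1.0, 1.0)},
    "depth": {"kernels": {1, 3, 5}, "depths": {2, 3, 4}, "width_ratio": (1.0, 1.0)},
    "width1": {"kernels": {1, 3, 5}, "depths": {2, 3, 4}, "width_ratio": (0.5, 1.0)},
    "width2": {"kernels": {1, 3, 5}, "depths": {2, 3, 4}, "width_ratio": (0.25, 1.0)},
}


def width_range(stage):
    """Return the (smallest, largest) channel width a stage is assumed to cover."""
    low, high = STAGES[stage]["width_ratio"]
    return int(low * BASE_WIDTH), int(high * BASE_WIDTH)


for name in STAGES:
    print(name, sorted(STAGES[name]["depths"]), width_range(name))
```

Under these assumptions, the lower width bounds of 256 and 128 line up with the C1min and C2min tags.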
Furthermore, some subnets are provided as batchnorm weights for their corresponding trained supernets:
- Dynamic Kernel
  - kernel/kernel.max.bn.tar
  - kernel/kernel.Kmin.bn.tar
- Dynamic Depth
  - depth/depth.max.bn.tar
  - depth/depth.Kmin.bn.tar
  - depth/depth.Dmin.bn.tar
  - depth/depth.3.512.5.5.3.3.1536.bn.tar
  - depth/depth.ecapa-tdnn.3.512.512.512.512.5.3.3.3.1536.bn.tar
- Dynamic Width 1
  - width1/width1.torchparams
  - width1/width1.max.bn.tar
  - width1/width1.Kmin.bn.tar
  - width1/width1.Dmin.bn.tar
  - width1/width1.C1min.bn.tar
  - width1/width1.3.383.256.256.256.5.3.3.3.768.bn.tar
- Dynamic Width 2
  - width2/width2.max.bn.tar
  - width2/width2.Kmin.bn.tar
  - width2/width2.Dmin.bn.tar
  - width2/width2.C1min.bn.tar
  - width2/width2.C2min.bn.tar
  - width2/width2.3.384.3.1152.bn.tar
  - width2/width2.3.256.256.384.384.1.3.5.3.1152.bn.tar
  - width2/width2.2.256.256.256.3.3.3.400.bn.tar
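The numeric file names above appear to encode a subnet configuration. The following sketch decodes them under an assumed reading that is inferred only from the names in this list: the first number is the depth, the last is the final layer width, and the numbers in between are channel widths followed by kernel sizes, where a single value stands for all depth + 1 layers. The function `parse_subnet_name` is hypothetical and not part of the released code.

```python
KERNEL_SIZES = {1, 3, 5}  # kernel choices listed for the Dynamic Kernel version


def _expand(values, count):
    """Repeat a single shared value for every layer; keep full lists as they are."""
    return values * count if len(values) == 1 else values


def parse_subnet_name(path):
    """Decode a '*.bn.tar' file name into stage, label and (assumed) configuration."""
    name = path.rsplit("/", 1)[-1]
    suffix = ".bn.tar"
    if not name.endswith(suffix):
        raise ValueError(f"not a batchnorm archive: {path}")
    fields = name[: -len(suffix)].split(".")
    stage, rest = fields[0], fields[1:]
    numbers = [int(f) for f in rest if f.isdigit()]
    labels = [f for f in rest if not f.isdigit()]
    label = labels[0] if labels else None
    if not numbers:
        # Tag-only names such as "kernel.max.bn.tar" carry no numeric configuration.
        return {"stage": stage, "label": label, "config": None}
    depth, middle, last = numbers[0], numbers[1:-1], numbers[-1]
    # Assumption: widths come first, and kernel sizes start at the first value in {1, 3, 5}.
    split = next((i for i, v in enumerate(middle) if v in KERNEL_SIZES), len(middle))
    widths, kernels = middle[:split], middle[split:]
    if not widths or not kernels:
        raise ValueError(f"cannot split widths and kernels in: {path}")
    config = (depth, _expand(widths, depth + 1), _expand(kernels, depth + 1), last)
    return {"stage": stage, "label": label, "config": config}


print(parse_subnet_name("width2/width2.3.384.3.1152.bn.tar"))
print(parse_subnet_name("depth/depth.3.512.5.5.3.3.1536.bn.tar"))
```

The returned tuple uses the same layout as the tag definitions in this card, so decoded names and tags can be compared directly.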
The tags are defined as follows.
- max: (4, [512, 512, 512, 512, 512], [5, 5, 5, 5, 5], 1536)
- Kmin: (4, [512, 512, 512, 512, 512], [1, 1, 1, 1, 1], 1536)
- Dmin: (2, [512, 512, 512], [1, 1, 1], 1536)
- C1min: (2, [256, 256, 256], [1, 1, 1], 768)
- C2min: (2, [128, 128, 128], [1, 1, 1], 384)
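The tag tuples can be written down as plain Python data and checked for consistency. The sketch below assumes that each tuple reads as (depth, channel widths, kernel sizes, final layer width), which matches the values listed but is not stated explicitly in this card. The names `TAGS`, `is_consistent`, and `width_ratio` are hypothetical.

```python
# Tag definitions copied from the list above.
# Assumed layout: (depth, channel widths, kernel sizes, final layer width).
TAGS = {
    "max": (4, [512, 512, 512, 512, 512], [5, 5, 5, 5, 5], 1536),
    "Kmin": (4, [512, 512, 512, 512, 512], [1, 1, 1, 1, 1], 1536),
    "Dmin": (2, [512, 512, 512], [1, 1, 1], 1536),
    "C1min": (2, [256, 256, 256], [1, 1, 1], 768),
    "C2min": (2, [128, 128, 128], [1, 1, 1], 384),
}


def is_consistent(config):
    """Check that a configuration lists one width and one kernel for each of depth + 1 layers."""
    depth, widths, kernels, _ = config
    return len(widths) == len(kernels) == depth + 1


def width_ratio(tag):
    """Return (channel ratio, final layer ratio) of a tag relative to the max tag."""
    _, widths, _, last = TAGS[tag]
    _, max_widths, _, max_last = TAGS["max"]
    return min(widths) / max(max_widths), last / max_last


for tag, config in TAGS.items():
    print(tag, is_consistent(config), width_ratio(tag))
```

Read this way, C1min and C2min sit at the lower width bounds of the Dynamic Width 1 and Dynamic Width 2 versions, 0.5 and 0.25 respectively.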
More details about EfficientTDNN can be found in the EfficientTDNN paper.