Jenthe committed on
Commit
578fe9e
1 Parent(s): f59e715

Update README.md

  1. README.md +9 -0
README.md CHANGED
@@ -62,6 +62,15 @@ feature = ecapa2_model(audio, label='embedding|gfe_1|pool')
 
 The following table describes the available features:
 
+| Feature ID | Description | Usage | Labels |
+| ----------- | ----------- | ----------- | ----------- |
+| gfe_1, gfe_2 | Mean and variance of frame-level features as indicated in Figure 1, extracted before the ReLU and BatchNorm layers. | Furthest from the speaker embedding; probably useful in tasks less related to speaker characteristics.
+| pool | Pooled statistics (mean and variance) before the bottleneck speaker embedding layer, extracted before the ReLU layer. | Generally captures intra-speaker variance better than speaker embeddings, e.g. for speaker profiling or emotion recognition.
+| attention | Same as the pooled statistics but with the attention weights applied. | Generally captures intra-speaker variance better than speaker embeddings, e.g. for speaker profiling or emotion recognition.
+| embedding | The standard ECAPA2 speaker embedding. | Best for tasks that depend directly on speaker identity (as opposed to speaker characteristics), e.g. speaker verification or speaker diarization.
+
+The following table describes the available features:
+
 | Feature Type | Description | Usage | Labels |
 | ----------- | ----------- | ----------- | ----------- |
 | Local Feature | Non-uniform effective receptive field in the frequency dimension of each frame-level feature. | Abstract features, probably useful in tasks less related to speaker characteristics. | lfe1, lfe2, lfe3, lfe4
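
The hunk header above shows several feature IDs passed in one `label` string, separated by `|`. As a minimal sketch, assuming that separator convention holds for every ID in the new table, a small helper can assemble and validate such a string. `build_label` and `KNOWN_FEATURE_IDS` are hypothetical names used for illustration and are not part of ECAPA2. The `lfe1` to `lfe4` labels are left out because the diff does not show them being passed this way.

```python
# Feature IDs taken from the table added in this commit.
# Assumption: these are exactly the strings the model accepts in `label`.
KNOWN_FEATURE_IDS = ("gfe_1", "gfe_2", "pool", "attention", "embedding")


def build_label(*feature_ids):
    """Join feature IDs into one label string, e.g. 'embedding|gfe_1|pool'.

    Hypothetical helper: it only builds the string and never calls the model.
    """
    if not feature_ids:
        raise ValueError("at least one feature ID is required")
    unknown = [f for f in feature_ids if f not in KNOWN_FEATURE_IDS]
    if unknown:
        raise ValueError(f"unknown feature ID(s): {unknown}")
    # Assumption: '|' separates IDs, as in the README line in the hunk header.
    return "|".join(feature_ids)


label = build_label("embedding", "gfe_1", "pool")
# The resulting string would then be passed as in the README:
# feature = ecapa2_model(audio, label=label)
```

Validating against a fixed tuple catches a typo such as `gfe1` before the model is called. The tuple would need updating if the model accepts other labels.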