Jenthe committed on
Commit f8b0f89
1 Parent(s): 1f55acd

Update README.md

  1. README.md +29 -8
README.md CHANGED
@@ -27,6 +27,16 @@ Or with Conda:
  conda install -c conda-forge huggingface_hub
  ```
 
+ Download model:
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # automatically checks for cached file
+ model_file = hf_hub_download(repo_id='Jenthe/ECAPA2', filename='model.pt')
+ ```
+
+
  ### Speaker Embedding Extraction
 
  Extracting speaker embeddings is easy and only requires a few lines of code:
@@ -34,17 +44,28 @@ Extracting speaker embeddings is easy and only requires a few lines of code:
  ```python
  import torch
  import torchaudio
- from huggingface_hub import hf_hub_download
 
- # automatically checks for cached file
- model_file = hf_hub_download(repo_id='Jenthe/ECAPA2', filename='model.pt')
-
- # change map_location to 'cuda' for GPU inference (recommended)
  ecapa2_model = torch.jit.load(model_file, map_location='cpu')
+ audio, sr = torchaudio.load('sample.wav') # sample rate of 16 kHz expected
+
+ with torch.no_grad():
+     embedding = ecapa2_model(audio)
+ ```
+
+ For faster, 16-bit half-precision CUDA inference (recommended):
+
+ ```python
+ import torch
+ import torchaudio
+
+ ecapa2_model = torch.jit.load(model_file, map_location='cuda')
+
+ ecapa2_model.half() # optional, but results in faster inference
+
+ audio, sr = torchaudio.load('sample.wav') # sample rate of 16 kHz expected
 
- # note: input audio should have a sample rate of 16 kHz
- audio, sr = torchaudio.load('sample.wav')
- embedding = ecapa2_model(audio)
+ with torch.no_grad():
+     embedding = ecapa2_model(audio)
  ```
 
  ### Hierarchical Feature Extraction
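
Note on the `# sample rate of 16 kHz expected` comment in this diff: the snippets load `sample.wav` and pass it to the model without inspecting `sr`. A minimal sketch of a pre-check is shown below. It is not part of the commit. The helper names `wav_sample_rate` and `has_expected_sample_rate` and the constant `EXPECTED_SAMPLE_RATE` are hypothetical, and the sketch assumes a plain PCM WAV file, since it relies only on Python's standard `wave` module.

```python
import wave

# Assumption: 16 kHz, taken from the comment in the README snippet.
EXPECTED_SAMPLE_RATE = 16_000  # Hz


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def has_expected_sample_rate(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True if the WAV file at `path` is sampled at `expected` Hz."""
    return wav_sample_rate(path) == expected
```

If the check fails, one option is to resample the tensor returned by `torchaudio.load` with `torchaudio.functional.resample(audio, sr, 16000)` before calling the model. Whether the model tolerates other sample rates is not stated in the diff.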