---
license: mit
---
# Model description

EnCodecMAE is an audio feature extractor. It is pretrained with a masked-language-modelling objective to predict discrete targets generated by EnCodec, a neural audio codec.
For details on the architecture and pretraining procedure, see the [paper](https://arxiv.org/abs/2309.07391).

# Usage

### 1) Clone the [EnCodecMAE library](https://github.com/habla-liaa/encodecmae):
```
git clone https://github.com/habla-liaa/encodecmae.git
```

### 2) Install it:

```
cd encodecmae
pip install -e .
```

### 3) Extract embeddings in Python:

``` python
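from encodecmae import load_model

# Load the pretrained 'base' model onto the first CUDA GPU.
model = load_model('base', device='cuda:0')
# Replace the example path with the path to your own audio file.
features = model.extract_features_from_file('gsc/bed/00176480_nohash_0.wav')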
```
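
A common next step is to reduce the extracted features to a single vector per audio clip, for example by averaging over time. The sketch below is not part of the EnCodecMAE library: `mean_pool` is a hypothetical helper, and it assumes `features` can be converted to a 2-D NumPy array of shape `(frames, dim)`. If the library returns a different type or layout (for example a tensor on the GPU), convert it first. The random array and its `(75, 768)` shape are placeholders that stand in for real model output.

``` python
import numpy as np


def mean_pool(features):
    """Average frame-level features over time into one clip-level vector.

    Hypothetical helper; assumes `features` is array-like with shape (frames, dim).
    """
    frames = np.asarray(features, dtype=np.float32)
    if frames.ndim != 2:
        raise ValueError(f"expected a (frames, dim) array, got shape {frames.shape}")
    return frames.mean(axis=0)


# Placeholder data standing in for the output of extract_features_from_file.
# The shape (75, 768) is an assumption chosen for illustration only.
rng = np.random.default_rng(0)
dummy_features = rng.standard_normal((75, 768)).astype(np.float32)

clip_embedding = mean_pool(dummy_features)
print(clip_embedding.shape)
```

With real output, `mean_pool(features)` would take the place of the placeholder call, provided the shape assumption holds.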