---
license: mit
---
# Model description
This is EnCodecMAE, an audio feature extractor pretrained with masked language modelling to predict discrete targets generated by EnCodec, a neural audio codec.
For more details about the architecture and pretraining procedure, read the [paper](https://arxiv.org/abs/2309.07391).
# Usage
### 1) Clone the [EnCodecMAE library](https://github.com/habla-liaa/encodecmae):
```
git clone https://github.com/habla-liaa/encodecmae.git
```
### 2) Install it:
```
cd encodecmae
pip install -e .
```
### 3) Extract embeddings in Python:
``` python
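from encodecmae import load_model

# Load the pretrained 'base' model onto the first CUDA GPU.
model = load_model('base', device='cuda:0')
# Extract embeddings from a single audio file.
features = model.extract_features_from_file('gsc/bed/00176480_nohash_0.wav')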
```
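To process a whole folder rather than a single file, the call above can be wrapped in a small loop. The helper below is a minimal sketch and is not part of the EnCodecMAE library: `extract_from_directory`, its parameters, and the `*.wav` pattern are hypothetical names chosen for illustration. It only assumes that the extraction function takes a file path, as `extract_features_from_file` does in step 3.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def extract_from_directory(
    extract_fn: Callable[[str], Any],
    audio_dir: str,
    pattern: str = "*.wav",
) -> Dict[str, Any]:
    """Apply `extract_fn` to every file under `audio_dir` that matches `pattern`.

    Hypothetical helper, not part of encodecmae. `extract_fn` is any callable
    that takes a file path, e.g. `model.extract_features_from_file`.
    """
    results: Dict[str, Any] = {}
    # Search subdirectories too, and sort so the order is stable between runs.
    for path in sorted(Path(audio_dir).rglob(pattern)):
        results[str(path)] = extract_fn(str(path))
    return results
```

With the model loaded as in step 3, `extract_from_directory(model.extract_features_from_file, 'gsc')` would return a dictionary that maps each `.wav` path under `gsc` to its features.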