---
license: mit
---

# Model description

This is EnCodecMAE, an audio feature extractor pretrained with masked language modelling to predict discrete targets generated by EnCodec, a neural audio codec.
For more details about the architecture and pretraining procedure, read the [paper](https://arxiv.org/abs/2309.07391).

# Usage

### 1) Clone the [EnCodecMAE library](https://github.com/habla-liaa/encodecmae):

```
git clone https://github.com/habla-liaa/encodecmae.git
```

### 2) Install it:

```
cd encodecmae
pip install -e .
```

### 3) Extract embeddings in Python:

```python
from encodecmae import load_model

model = load_model('base', device='cuda:0')
features = model.extract_features_from_file('gsc/bed/00176480_nohash_0.wav')
```
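
To extract embeddings for a whole folder rather than a single file, the call from step 3 can be wrapped in a small loop. The sketch below is illustrative only: `extract_features_for_folder` and `run_example` are hypothetical helpers, not part of the EnCodecMAE library, and the only library calls used are `load_model` and `extract_features_from_file` as shown in step 3. The folder `gsc/bed` is assumed to contain `.wav` files.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def extract_features_for_folder(
    extract_fn: Callable[[str], Any], folder: str, pattern: str = "*.wav"
) -> Dict[str, Any]:
    """Hypothetical helper: apply extract_fn to every matching file under folder.

    Files are searched recursively and visited in sorted path order.
    Returns a dict mapping each file path to whatever extract_fn returned.
    """
    features = {}
    for path in sorted(Path(folder).rglob(pattern)):
        features[str(path)] = extract_fn(str(path))
    return features


def run_example() -> None:
    """Assumes encodecmae is installed as in steps 1 and 2 and a GPU is available."""
    from encodecmae import load_model  # imported lazily so the helper works without it

    model = load_model('base', device='cuda:0')
    all_features = extract_features_for_folder(
        model.extract_features_from_file, 'gsc/bed'
    )
    print(f"Extracted features for {len(all_features)} files")
```

Calling `run_example()` runs the extraction. Passing the extractor as a function keeps the loop independent of the model, so it can be tried with any callable that takes a file path.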