MAGDA Sample Tagger
ONNX exports of LAION's CLAP HTSAT-unfused model plus the RoBERTa tokenizer, packaged for the MAGDA DAW's sample library (issue #768).
What's in this repo
| File | Size | SHA-256 |
|---|---|---|
| `clap_audio.onnx` | 111.8 MB | 3f42f71e555b62709910b6efa66fa5879f00d9571874b12b0fa674f82dbfe332 |
| `clap_text.onnx` | 478.2 MB | c07b27204836877d5b615c103685b66ea8f21bc6b5b70a572be356125423a8bf |
| `tokenizer.json` | 3.4 MB | 4fd1d86b4f5b53f40a609fcd11c1f34024b735f870a07439d70202b98493661a |
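A downloaded copy can be checked against the digests in the table. The following is a minimal sketch, not part of MAGDA: the helper names `sha256_of` and `verify` and the `model_dir` argument are assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path

# Expected digests, copied from the table in this README.
EXPECTED_SHA256 = {
    "clap_audio.onnx": "3f42f71e555b62709910b6efa66fa5879f00d9571874b12b0fa674f82dbfe332",
    "clap_text.onnx": "c07b27204836877d5b615c103685b66ea8f21bc6b5b70a572be356125423a8bf",
    "tokenizer.json": "4fd1d86b4f5b53f40a609fcd11c1f34024b735f870a07439d70202b98493661a",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large models are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(model_dir: Path) -> dict[str, bool]:
    """Return {filename: digest matches}; a missing file maps to False."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = model_dir / name
        results[name] = path.is_file() and sha256_of(path) == expected
    return results
```

Calling `verify(Path("path/to/models"))` (a hypothetical directory) returns one boolean per file, so a caller can report exactly which artefact is missing or corrupt.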
- `clap_audio.onnx`: audio encoder. Takes a mono 48 kHz waveform, produces a 512-d normalised embedding suitable for cosine similarity search.
- `clap_text.onnx`: text encoder. Takes RoBERTa token ids + attention mask, produces a 512-d normalised embedding in the same space as the audio encoder so a text query can rank audio files by similarity.
- `tokenizer.json`: the RoBERTa BPE tokenizer that pairs with the text encoder. MAGDA's C++ tokenizer reads this file directly.
How MAGDA uses these
MAGDA's media database (a SQLite catalogue of audio samples) uses these encoders to:
- Compute an embedding per indexed sample at index time, stored in the `media_embedding` table.
- Encode the user's free-text search query at query time and rank samples by cosine similarity to the query embedding.
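The ranking step described above can be sketched in a few lines. This is an illustration only, not MAGDA's implementation: the function names, the in-memory `dict` of sample embeddings, and the toy 3-d vectors are assumptions (the real embeddings are 512-d and live in the database). Pure Python is used to keep the sketch dependency-free.

```python
import math


def l2_normalise(vec):
    """Scale a vector to unit length; the encoders already emit normalised vectors."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    a, b = l2_normalise(a), l2_normalise(b)
    return sum(x * y for x, y in zip(a, b))


def rank_samples(query_embedding, sample_embeddings, top_k=5):
    """Return up to top_k (sample_id, score) pairs, most similar first."""
    scored = [
        (sample_id, cosine_similarity(query_embedding, emb))
        for sample_id, emb in sample_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data standing in for a text-query embedding and stored audio embeddings.
query = [1.0, 0.0, 0.0]
samples = {
    "kick.wav": [0.9, 0.1, 0.0],
    "snare.wav": [0.0, 1.0, 0.0],
    "hat.wav": [-1.0, 0.0, 0.0],
}
ranking = rank_samples(query, samples, top_k=2)
```

Because both encoders produce normalised vectors, the normalisation inside `cosine_similarity` is redundant for real embeddings and the score reduces to a plain dot product; it is kept here so the sketch also works on unnormalised toy input.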
Without these models MAGDA falls back to filename / tag full-text search: still useful, just no semantic similarity.
Export procedure
ONNX exports are generated from laion/clap-htsat-unfused via the
export script in MAGDA's prototype:
prototypes/media_db/src/media_db/embeddings/onnx_export.py
Notes:
- Run on CPU (MPS does not support float64 used by the audio encoder's mel filterbank).
- Requires `transformers >= 5.x`. The audio-feature accessor was renamed from `audios=` to `audio=` between 4.x and 5.x; passing the old kwarg silently returns wrong shapes.
- `tokenizer.json` is the unmodified file from the upstream HF repo, fetched via `AutoTokenizer.from_pretrained(...).save_pretrained(...)`.
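Since the old kwarg fails silently, an export script may want to refuse to run on an older `transformers`. The guard below is a hedged sketch and is not taken from `onnx_export.py`; the function names are hypothetical, and it only inspects the leading major version number.

```python
import re


def major_version(version: str) -> int:
    """Extract the leading major number from a version string such as '5.0.0.dev0'."""
    match = re.match(r"\s*(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1))


def check_transformers_version(version: str, minimum_major: int = 5) -> None:
    """Raise instead of exporting with a release that still expects `audios=`."""
    if major_version(version) < minimum_major:
        raise RuntimeError(
            f"transformers {version} is too old; "
            f"version {minimum_major}.x or newer is required for the export"
        )
```

In a script this would be called once at startup as `check_transformers_version(transformers.__version__)`, before any model is loaded.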
License
BSD-3-Clause, same as the upstream LAION CLAP weights. See the upstream repo for the original notice and attribution.