HT-Demucs FT — Bass Specialist (PyTorch)
Bass isolation specialist from HT-Demucs FT, ~1/4 the size of the full ensemble.
This is sub-model 1 (zero-based, i.e. `bag.models[1]`) of the 4-bag htdemucs_ft ensemble by
Défossez et al. (Meta AI), extracted as a standalone
~160 MB model. It produces the bass stem with the same quality as
the full ensemble (median SDR 10.38 dB on MUSDB18-HQ, 2nd of all
models in our 2026 benchmark, behind mdx_extra_q at 11.42 dB) at roughly 1/4 the compute cost.
Want all 4 stems in one request? Use the full ensemble: StemSplitio/htdemucs-ft-pytorch.

Want a hosted REST API with credits and a dashboard? Use the StemSplit API.
Why this model
| Property | This model | Full htdemucs_ft bag |
|---|---|---|
| Disk size | ~160 MB | ~640 MB |
| Per-3-min-song latency (M4 Pro MPS) | ~22 s (RTF 0.12) | ~47 s (RTF 0.26) |
| Bass SDR on MUSDB18-HQ | 10.38 dB | 10.38 dB (identical — the bag's bass output IS this sub-model's output) |
| Other stems returned | None (focused) | All 4 |
If you only need the bass stem in production, this model is faster and smaller than the full ensemble with identical bass quality: going by the latencies in the table above, about 2.1× faster wall time (22 s vs 47 s) in our smoke tests on M4 Pro MPS.
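The latency and RTF columns in the table are related by simple arithmetic. As a sanity check, the sketch below recomputes the real-time factor (processing time divided by audio duration) and the speedup implied by the two latencies. It assumes the "3-min song" is exactly 180 s; the helper name `rtf` is illustrative and not part of any library.

```python
SONG_SECONDS = 3 * 60  # "3-min song" from the table, assumed to be exactly 180 s


def rtf(latency_s: float, audio_s: float = SONG_SECONDS) -> float:
    """Real-time factor: seconds of compute per second of audio (lower is faster)."""
    return latency_s / audio_s


specialist_rtf = rtf(22)  # about 0.12
full_bag_rtf = rtf(47)    # about 0.26
speedup = 47 / 22         # about 2.1x, from these two latencies alone

print(f"specialist RTF {specialist_rtf:.2f}, "
      f"full bag RTF {full_bag_rtf:.2f}, speedup {speedup:.1f}x")
```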
Common use cases
- Bassline transcription — extract bass for tab generation, MIDI conversion, or chord detection
- Mix rebalancing — isolate and re-equalise the bass bus on a finished mix
- Music education — learn basslines from any record by hearing them isolated
- Sub-bass mastering reference — compare your low-end against pro mixes
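For the mix-rebalancing case, one simple approach is to add a scaled copy of the isolated stem back onto the original mix. The sketch below is an illustration under stated assumptions, not a StemSplit tool: `db_to_gain` and `rebalance_bass` are hypothetical helper names, and `mix` and `bass` are assumed to be sample-aligned signals at the same sample rate and scale (for example NumPy arrays of equal shape).

```python
def db_to_gain(gain_db: float) -> float:
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def rebalance_bass(mix, bass, gain_db: float):
    """Return the mix with its bass component scaled by gain_db.

    Works on plain floats or on NumPy arrays of identical shape.
    Assumes mix is approximately bass + everything_else, so that
    mix + (g - 1) * bass is approximately everything_else + g * bass.
    """
    return mix + (db_to_gain(gain_db) - 1.0) * bass
```

For example, `rebalance_bass(mix, bass, -3.0)` lowers the bass by about 3 dB and `+3.0` raises it. Because separation is never perfect, the result is an approximation, and a positive gain can push samples past full scale, so check for clipping before writing the file.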
Quick start (Python)
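```python
import base64, io, json
import soundfile as sf
from huggingface_hub import InferenceClient

with open("your-song.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

client = InferenceClient(model="StemSplitio/htdemucs-ft-bass-pytorch")
# post() returns the raw response bytes, so parse the JSON before indexing.
# post() was deprecated in later huggingface_hub releases; if your version
# lacks it, call the endpoint over plain HTTP instead.
result = json.loads(client.post(json={"inputs": audio_b64}))

wav, sr = sf.read(io.BytesIO(base64.b64decode(result["bass"])))
sf.write("out_bass.wav", wav, sr)
```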
Or run locally without Hugging Face at all:
```python
import torch, soundfile as sf
from demucs.apply import apply_model
from demucs.audio import convert_audio
from demucs.pretrained import get_model

bag = get_model("htdemucs_ft")
model = bag.models[1].eval()  # the bass specialist

wav, sr = sf.read("your-song.mp3", dtype="float32", always_2d=True)
wav = torch.from_numpy(wav.T).contiguous()
wav = convert_audio(wav, sr, bag.samplerate, bag.audio_channels).unsqueeze(0)

with torch.no_grad():
    stems = apply_model(model, wav, device="mps" if torch.backends.mps.is_available() else "cpu")[0]

# bag.sources == ["drums", "bass", "other", "vocals"]; pick the bass row
sf.write("out_bass.wav", stems[bag.sources.index("bass")].T.numpy(), bag.samplerate)
```
Deploy on Hugging Face Inference Endpoints
Click Deploy → Inference Endpoints above, pick a GPU instance, and HF
will spin up a container running handler.py.
| Hardware | Latency for 3-min song |
|---|---|
| NVIDIA L4 | ~3 s |
| NVIDIA T4 small | ~7 s |
| CPU x4 (basic) | ~48 s |
(Faster than the full-bag latency by roughly the same factor as on M4 Pro, since only this specialist sub-model runs. The cloud GPU numbers are extrapolated from M4 Pro measurements.)
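```bash
# Strip newlines from the base64 output (GNU base64 wraps lines) and stream the
# JSON body over stdin so a large file does not hit the shell's argument limit.
{ printf '{"inputs": "'; base64 < your-song.mp3 | tr -d '\n'; printf '"}'; } |
  curl -X POST https://<your-endpoint>.endpoints.huggingface.cloud \
    -H "Authorization: Bearer $HF_TOKEN" \
    -H "Content-Type: application/json" \
    -d @-
```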
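The same request can be made from Python with only the standard library. This is a minimal sketch, assuming the endpoint accepts the `{"inputs": <base64 audio>}` payload shown above and returns JSON with a base64-encoded `"bass"` field, as in the quick start. The function names are illustrative, and the endpoint URL and token are placeholders you supply.

```python
import base64
import json
import urllib.request


def build_request(audio_bytes: bytes, endpoint_url: str, token: str) -> urllib.request.Request:
    """Build the POST request: JSON body with the audio base64-encoded under "inputs"."""
    payload = {"inputs": base64.b64encode(audio_bytes).decode("ascii")}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def extract_bass(response_body: bytes) -> bytes:
    """Decode the base64 "bass" field (assumed response shape) into audio file bytes."""
    return base64.b64decode(json.loads(response_body)["bass"])


def separate_bass(in_path: str, out_path: str, endpoint_url: str, token: str) -> None:
    """Send one file to the endpoint and write the returned bass stem to out_path."""
    with open(in_path, "rb") as f:
        request = build_request(f.read(), endpoint_url, token)
    with urllib.request.urlopen(request) as response:
        bass_bytes = extract_bass(response.read())
    with open(out_path, "wb") as f:
        f.write(bass_bytes)
```

Call it as `separate_bass("your-song.mp3", "out_bass.wav", endpoint_url, os.environ["HF_TOKEN"])` after importing `os`. Keeping the request building and response decoding in separate functions makes both easy to check without touching the network.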
Related models from StemSplit
| Repo | Stem | When to use |
|---|---|---|
| htdemucs-ft-pytorch | all 4 | When you need vocals + drums + bass + other in one request |
| htdemucs-ft-vocals-pytorch | vocals | Best vocal SDR in our benchmark (9.19 dB); karaoke, acapella |
| htdemucs-ft-drums-pytorch | drums | Drum extraction, beat transcription, sample-pack creation |
| htdemucs-ft-bass-pytorch | bass | Bassline transcription, mix rebalancing |
| htdemucs-ft-other-pytorch | other / instrumental | Karaoke instrumentals, sample-flipping, music-bed extraction |
Full benchmark across every popular open-source separator: StemSplitio/stem-separation-benchmark-2026.
License & attribution
This repo is MIT-licensed, matching the original HT-Demucs.
Original authors (please cite if you use this model in research):
```bibtex
@inproceedings{rouard2023hybrid,
  title     = {Hybrid Transformers for Music Source Separation},
  author    = {Rouard, Simon and Massa, Francisco and D{\'e}fossez, Alexandre},
  booktitle = {ICASSP},
  year      = {2023}
}
```
- Original model: facebookresearch/demucs
- Packaging by StemSplit
- Search keywords: bass extraction, isolate bass from song, bassline extractor, AI bass separator