HT-Demucs FT — Bass Specialist (PyTorch)

Bass isolation specialist from HT-Demucs FT, ~1/4 the size of the full ensemble.

This is sub-model 1 (zero-indexed) of the 4-model htdemucs_ft bag by Défossez et al. (Meta AI), extracted as a standalone ~160 MB model. It produces the same bass stem as the full ensemble at roughly 1/4 the compute cost. Its median SDR is 10.38 dB on MUSDB18-HQ, second among all models in our 2026 benchmark, behind mdx_extra_q at 11.42 dB.

Want all 4 stems in one request? Use the full ensemble: StemSplitio/htdemucs-ft-pytorch

Want a hosted REST API with credits and a dashboard? Use the StemSplit API.


Why this model

| Property | This model | Full htdemucs_ft bag |
|---|---|---|
| Disk size | ~160 MB | ~640 MB |
| Latency for a 3-min song (M4 Pro MPS) | ~22 s (RTF 0.12) | ~47 s (RTF 0.26) |
| Bass SDR on MUSDB18-HQ | 10.38 dB | 10.38 dB (identical: the bag's bass output is this sub-model's output) |
| Other stems returned | None (focused) | All 4 |

If you only need the bass stem in production, this model is smaller and faster than the full ensemble with identical bass quality: more than twice as fast in wall time in our smoke tests on M4 Pro MPS.
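For reference, the real-time factor (RTF) quoted in the table is processing time divided by audio duration, so lower is faster. The sketch below reproduces the two RTF values from the rounded 3-minute latencies; the function and variable names are illustrative and not part of any library.

```python
def real_time_factor(processing_s: float, audio_s: float) -> float:
    """Seconds of compute per second of audio (lower is faster)."""
    return processing_s / audio_s

AUDIO_S = 180.0  # a 3-minute song
# Rounded M4 Pro MPS latencies taken from the table above.
LATENCY_S = {"bass specialist": 22.0, "full htdemucs_ft bag": 47.0}

rtf = {name: real_time_factor(t, AUDIO_S) for name, t in LATENCY_S.items()}
for name, value in rtf.items():
    print(f"{name}: RTF {value:.2f}")
```

Because the inputs are rounded, expect small differences from RTFs measured on your own hardware.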


Common use cases

  • Bassline transcription — extract bass for tab generation, MIDI conversion, or chord detection
  • Mix rebalancing — isolate and re-equalise the bass bus on a finished mix
  • Music education — learn basslines from any record by hearing them isolated
  • Sub-bass mastering reference — compare your low-end against pro mixes

Quick start (Python)

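import base64, io, json
import soundfile as sf
from huggingface_hub import InferenceClient

# Read the input file and base64-encode it for the JSON payload.
with open("your-song.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

client = InferenceClient(model="StemSplitio/htdemucs-ft-bass-pytorch")
# InferenceClient.post() returns raw bytes, so parse the JSON body before indexing.
# Note: post() was deprecated in newer huggingface_hub releases; if it is missing,
# pin an older version or POST to your endpoint URL directly.
result = json.loads(client.post(json={"inputs": audio_b64}))

# The "bass" field holds a base64-encoded audio file; decode it and save as WAV.
wav, sr = sf.read(io.BytesIO(base64.b64decode(result["bass"])))
sf.write("out_bass.wav", wav, sr)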
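The request and response shapes used above (a JSON body with a base64 `inputs` field, and a JSON reply whose `bass` field holds base64-encoded audio) can be wrapped in two small helpers. This is a sketch using only the standard library. `build_payload` and `decode_stem` are hypothetical names, and the response layout is assumed from the snippet above rather than taken from a published schema.

```python
import base64
import json

def build_payload(audio_bytes: bytes) -> bytes:
    """Encode raw audio file bytes as the JSON body {"inputs": "<base64>"}."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return json.dumps({"inputs": encoded}).encode("utf-8")

def decode_stem(response_body: bytes, stem: str = "bass") -> bytes:
    """Return the decoded audio bytes for one stem from a JSON response body."""
    reply = json.loads(response_body)
    if stem not in reply:
        # Surface the keys we did get (e.g. an "error" field) to ease debugging.
        raise KeyError(f"response has no {stem!r} field; got keys {sorted(reply)}")
    return base64.b64decode(reply[stem])

# Round-trip check with stand-in bytes instead of real audio.
fake_audio = b"RIFF....WAVE"
fake_reply = json.dumps(
    {"bass": base64.b64encode(fake_audio).decode("ascii")}
).encode("utf-8")
assert decode_stem(fake_reply) == fake_audio
```

Checking for the expected key before decoding turns a confusing `KeyError: 'bass'` into a message that shows what the server actually returned.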

Or run locally without Hugging Face at all:

import torch, soundfile as sf
from demucs.apply import apply_model
from demucs.audio import convert_audio
from demucs.pretrained import get_model

bag = get_model("htdemucs_ft")
model = bag.models[1].eval()  # the bass specialist
wav, sr = sf.read("your-song.mp3", dtype="float32", always_2d=True)
wav = torch.from_numpy(wav.T).contiguous()
wav = convert_audio(wav, sr, bag.samplerate, bag.audio_channels).unsqueeze(0)

with torch.no_grad():
    stems = apply_model(model, wav, device="mps" if torch.backends.mps.is_available() else "cpu")[0]

# bag.sources == ["drums", "bass", "other", "vocals"]; pick the bass row
sf.write("out_bass.wav", stems[bag.sources.index("bass")].T.numpy(), bag.samplerate)

Deploy on Hugging Face Inference Endpoints

Click Deploy → Inference Endpoints above, pick a GPU instance, and HF will spin up a container running handler.py.

| Hardware | Latency for 3-min song |
|---|---|
| NVIDIA L4 | ~3 s |
| NVIDIA T4 small | ~7 s |
| CPU x4 (basic) | ~48 s |

(Roughly 2.6× faster than the full-bag latency, since we run only this specialist sub-model. Cloud GPU numbers extrapolated from M4 Pro measurements.)
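To size an endpoint, the 3-minute latencies above can be scaled to other track lengths. The sketch below assumes latency grows linearly with audio duration and ignores cold starts, upload time, and queueing, so treat the results as rough planning numbers only. The dictionary and function names are illustrative.

```python
# Approximate seconds to process a 3-minute (180 s) song, from the table above.
LATENCY_3MIN_S = {"NVIDIA L4": 3.0, "NVIDIA T4 small": 7.0, "CPU x4 (basic)": 48.0}
REFERENCE_AUDIO_S = 180.0

def estimate_latency_s(hardware: str, audio_s: float) -> float:
    """Scale the 3-minute latency linearly to a track of `audio_s` seconds."""
    return LATENCY_3MIN_S[hardware] * audio_s / REFERENCE_AUDIO_S

def songs_per_hour(hardware: str, audio_s: float = REFERENCE_AUDIO_S) -> float:
    """Upper bound on sequential throughput for a single replica."""
    return 3600.0 / estimate_latency_s(hardware, audio_s)

for hw in LATENCY_3MIN_S:
    print(f"{hw}: 5-min track ~{estimate_latency_s(hw, 300):.0f} s, "
          f"up to ~{songs_per_hour(hw):.0f} three-minute songs/hour")
```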

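# Stream the JSON body over stdin: a base64-encoded song is usually too large for a
# command-line argument, and GNU base64 wraps lines unless newlines are stripped.
{ printf '{"inputs": "'; base64 < your-song.mp3 | tr -d '\n'; printf '"}'; } |
  curl -X POST https://<your-endpoint>.endpoints.huggingface.cloud \
    -H "Authorization: Bearer $HF_TOKEN" \
    -H "Content-Type: application/json" \
    --data-binary @-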
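The same call can be made from Python without extra dependencies. The sketch below builds the request with `urllib.request`. The endpoint URL is a placeholder you must replace, the token is read from the `HF_TOKEN` environment variable as in the curl example, and the `bass` response field is assumed from the Quick start snippet. `build_request` and `separate_bass` are hypothetical helper names.

```python
import base64
import json
import os
import urllib.request

def build_request(endpoint_url: str, audio_bytes: bytes, token: str) -> urllib.request.Request:
    """Build a POST request carrying {"inputs": "<base64 audio>"} as JSON."""
    body = json.dumps({"inputs": base64.b64encode(audio_bytes).decode("ascii")})
    return urllib.request.Request(
        endpoint_url,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def separate_bass(endpoint_url: str, in_path: str, out_path: str) -> None:
    """Send a local audio file to the endpoint and save the returned bass stem."""
    with open(in_path, "rb") as f:
        request = build_request(endpoint_url, f.read(), os.environ["HF_TOKEN"])
    with urllib.request.urlopen(request, timeout=300) as response:
        reply = json.load(response)
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(reply["bass"]))  # assumed response field

# Example (not run here): replace the placeholder with your endpoint URL.
# separate_bass("https://<your-endpoint>.endpoints.huggingface.cloud",
#               "your-song.mp3", "out_bass.wav")
```

Keeping request construction separate from the network call makes the payload and headers easy to inspect before anything is sent.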


Related models from StemSplit

| Repo | Stem | When to use |
|---|---|---|
| htdemucs-ft-pytorch | all 4 | When you need vocals + drums + bass + other in one request |
| htdemucs-ft-vocals-pytorch | vocals | Best vocal SDR in our benchmark (9.19 dB); karaoke, acapella |
| htdemucs-ft-drums-pytorch | drums | Drum extraction, beat transcription, sample-pack creation |
| htdemucs-ft-bass-pytorch | bass | Bassline transcription, mix rebalancing |
| htdemucs-ft-other-pytorch | other / instrumental | Karaoke instrumentals, sample-flipping, music-bed extraction |

Full benchmark across every popular open-source separator: StemSplitio/stem-separation-benchmark-2026.


License & attribution

This repo is MIT-licensed, matching the original HT-Demucs.

Original authors (please cite if you use this model in research):

@inproceedings{rouard2023hybrid,
  title     = {Hybrid Transformers for Music Source Separation},
  author    = {Rouard, Simon and Massa, Francisco and D{\'e}fossez, Alexandre},
  booktitle = {ICASSP},
  year      = {2023}
}

  • Original model: facebookresearch/demucs
  • Packaging by StemSplit
  • Search keywords: bass extraction, isolate bass from song, bassline extractor, AI bass separator