SoundSense AI - GroupComm SudoRM-RF Speech Separator

This is a fine-tuned/adapted model checkpoint for SoundSense AI.

Base model / source:

Training data:

  • LibriSpeech train-clean-360 (speech) + WHAM! noise (5000 real ambient noise recordings)
  • Synthetic 2- and 3-speaker noisy mixtures generated on the fly, each at a random SNR between -5 and +20 dB
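
The on-the-fly mixing described above can be illustrated with a minimal, dependency-free sketch. This is not the project's actual data pipeline: the function names (`mean_power`, `mix_at_snr`, `random_noisy_mixture`) are hypothetical, the speakers are assumed to be equal-length and already level-normalized, and the SNR is assumed to be measured between the summed speech and the noise. A real pipeline would likely use vectorized array operations instead of Python lists.

```python
import math
import random


def mean_power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech-to-noise power ratio equals `snr_db`, then add it."""
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0.0 or p_noise == 0.0:
        raise ValueError("speech and noise must both be non-silent")
    # Noise power needed for the target SNR: P_speech / 10^(snr_db / 10)
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


def random_noisy_mixture(speakers, noise, rng, snr_range=(-5.0, 20.0)):
    """Sum 2 or 3 speaker signals and add noise at an SNR drawn uniformly from `snr_range`.

    Assumption: all signals have the same length and sample rate.
    """
    speech = [sum(samples) for samples in zip(*speakers)]
    snr_db = rng.uniform(*snr_range)
    return mix_at_snr(speech, noise, snr_db), snr_db
```

Passing a seeded `random.Random` instance as `rng` keeps the generated mixtures reproducible across runs.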

Use:

  • Part of SoundSense AI hackathon submission (Stage 2: Speech Separation).
  • Isolates up to 3 simultaneous speakers from a noisy mixed-audio input.

Limitations:

  • Built for prototype/demo use.
  • Current SI-SNR: 2.51 dB (2-speaker noisy) and 0.59 dB (3-speaker noisy), below the target KPIs of >18 dB and >10 dB respectively. Separation quality is not yet sufficient for clean speaker isolation; further training is required.
  • Performance should be verified on the target environment before deployment.
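
For readers who want to verify the SI-SNR figures on their own data, the sketch below follows the commonly used definition of scale-invariant signal-to-noise ratio. The card does not state how its numbers were computed, so this is an assumption rather than the project's evaluation code; the function name `si_snr` and the `eps` stabilizer are illustrative choices.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between an estimated and a reference signal.

    Both inputs are equal-length sequences of samples.
    """
    n = len(target)
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]

    # Project the estimate onto the target to get the "signal" component.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]

    # Whatever is left after the projection counts as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]

    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(e * e for e in e_noise) + eps
    return 10 * math.log10(signal_energy / noise_energy + eps)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is why the metric is preferred over plain SNR when a separator's output gain is arbitrary.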