---
license: apache-2.0
tags:
- speech-enhancement
- denoising
- coreml
- apple-silicon
- deepfilternet
library_name: speech-swift
---
DeepFilterNet3 — Core ML (FP16)
Real-time speech enhancement model for Apple Silicon. Removes background noise from speech audio.
- 2.1M params, FP16, ~4.2 MB
- Runs on Neural Engine via Core ML
- 48 kHz native sample rate, 10 ms frames
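The figures in the list above imply some simple arithmetic, sketched here in Swift. The helper `frameCount` and all constant names are hypothetical and are not part of speech-swift; the only inputs taken from this card are the 48 kHz rate, the 10 ms frame length, the 2.1M parameter count, and FP16 storage (2 bytes per value).

```swift
// Figures taken from the model card.
let sampleRate = 48_000          // Hz
let frameMilliseconds = 10       // ms per frame
let parameterCount = 2_100_000   // 2.1M parameters
let bytesPerFP16Value = 2        // FP16 stores each value in 2 bytes

// Samples in one 10 ms frame at 48 kHz: 48_000 * 10 / 1000.
let samplesPerFrame = sampleRate * frameMilliseconds / 1000

// Approximate weight storage, ignoring any container overhead.
let weightBytes = parameterCount * bytesPerFP16Value

// Hypothetical helper: number of whole frames in a buffer of `sampleCount` samples.
func frameCount(sampleCount: Int, samplesPerFrame: Int) -> Int {
    precondition(samplesPerFrame > 0, "samplesPerFrame must be positive")
    return sampleCount / samplesPerFrame
}

let fiveSeconds = 5 * sampleRate
print(samplesPerFrame)                                                          // 480
print(weightBytes)                                                              // 4200000
print(frameCount(sampleCount: fiveSeconds, samplesPerFrame: samplesPerFrame))   // 500
```

So a 5 s clip corresponds to 500 frames of 480 samples each, and the weights alone account for the quoted ~4.2 MB.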
Latency (M2 Max)
| Audio duration | Processing time | RTF (processing time / audio duration) |
|---|---|---|
| 5s | 0.65s | 0.13 |
| 10s | 1.2s | 0.12 |
| 20s | 4.8s | 0.24 |
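The RTF column is consistent with processing time divided by audio duration, so values below 1 mean faster than real time. A minimal sketch that reproduces the column from the other two; `realTimeFactor` is a hypothetical helper, not a speech-swift API.

```swift
// Real-time factor: seconds of compute per second of audio.
func realTimeFactor(processingSeconds: Double, audioSeconds: Double) -> Double {
    precondition(audioSeconds > 0, "audio duration must be positive")
    return processingSeconds / audioSeconds
}

// Rows copied from the latency table above.
let rows: [(audioSeconds: Double, processingSeconds: Double)] = [
    (5, 0.65),
    (10, 1.2),
    (20, 4.8),
]

for row in rows {
    let rtf = realTimeFactor(processingSeconds: row.processingSeconds,
                             audioSeconds: row.audioSeconds)
    // Round to two decimals to match the table's precision.
    let rounded = (rtf * 100).rounded() / 100
    print(rounded)   // 0.13, then 0.12, then 0.24
}
```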
Usage
import SpeechEnhancement
// Load the pretrained model, then denoise a buffer of 48 kHz audio.
let enhancer = try await SpeechEnhancer.fromPretrained()
let clean = try enhancer.enhance(audio: noisyAudio, sampleRate: 48000)
Command line:
swift run audio denoise noisy.wav --output clean.wav
Files
- DeepFilterNet3.mlpackage: Core ML FP16 model (Neural Engine)
- auxiliary.npz: ERB filterbank, Vorbis window, normalization states
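For readers unfamiliar with the Vorbis window mentioned above, here is a minimal sketch of the standard formula, w[n] = sin(π/2 · sin²(π(n + 0.5)/N)). It does not read auxiliary.npz, and the length of 960 samples (20 ms at 48 kHz, twice the 10 ms frame) is an assumption; the stored window may differ in length or scaling. `vorbisWindow` is a hypothetical helper.

```swift
import Foundation

// Standard Vorbis power-complementary window of the given even length.
func vorbisWindow(length: Int) -> [Float] {
    precondition(length > 0 && length % 2 == 0, "length must be positive and even")
    return (0..<length).map { n in
        let inner = sin(Double.pi * (Double(n) + 0.5) / Double(length))
        return Float(sin(Double.pi / 2 * inner * inner))
    }
}

// Assumed length: 960 samples. This value is not stated in the model card.
let window = vorbisWindow(length: 960)
let half = window.count / 2

// With 50% overlap, squared window values half a window apart sum to 1,
// so windowing at analysis and again at synthesis reconstructs the signal.
var maxDeviation: Float = 0
for n in 0..<half {
    let powerSum = window[n] * window[n] + window[n + half] * window[n + half]
    maxDeviation = max(maxDeviation, abs(powerSum - 1))
}
print(maxDeviation < 1e-5)
```

The power-complementary property is what makes this window suitable for overlap-add processing with a hop of half the window length.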
Reference
- DeepFilterNet3
- Part of speech-swift