Echo Dia (V4)
Echo Dia is DiariZen-v2 (BUT-FIT/diarizen-wavlm-large-s80-md-v2) fine-tuned on a compound of multi-domain meeting recordings.
Training
- Base model: BUT-FIT/diarizen-wavlm-large-s80-md-v2
- Training data: 9.1 h compound (AMI 3.5 h + AliMeeting 2.6 h + NOTSOFAR 3.0 h)
- Strategy: WavLM layer 23 unfrozen, lr_wavlm=2.5e-6, lr_head=1e-4
- Augmentation: SpecAugment (time + freq mask) + audio noise injection
- Duration: 60 minutes on RTX A6000 (Phase 3 winner V4)
- Best validation DER (ES2011a, 18 min): 17.69%
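The two learning rates in the strategy above can be expressed as optimizer parameter groups. The following is a minimal sketch, not the training code used for Echo Dia: the parameter name prefixes (`wavlm.`, `wavlm.encoder.layers.23.`), the helper `build_param_groups`, and the `FakeParam` stand-in are all assumptions made for illustration, and it assumes every WavLM layer other than layer 23 stays frozen.

```python
class FakeParam:
    """Hypothetical stand-in for torch.nn.Parameter so the sketch runs without PyTorch."""

    def __init__(self):
        self.requires_grad = True


def build_param_groups(
    named_parameters,
    backbone_prefix="wavlm.",                   # assumed name prefix of the WavLM backbone
    unfrozen_prefix="wavlm.encoder.layers.23.",  # assumed name prefix of layer 23
    lr_wavlm=2.5e-6,
    lr_head=1e-4,
):
    """Split (name, parameter) pairs into two optimizer groups.

    Layer-23 parameters get lr_wavlm, all other backbone parameters are
    frozen, and everything outside the backbone (the head) gets lr_head.
    """
    wavlm_params, head_params = [], []
    for name, param in named_parameters:
        if name.startswith(unfrozen_prefix):
            wavlm_params.append(param)
        elif name.startswith(backbone_prefix):
            param.requires_grad = False  # frozen: excluded from both groups
        else:
            head_params.append(param)
    return [
        {"params": wavlm_params, "lr": lr_wavlm},
        {"params": head_params, "lr": lr_head},
    ]
```

With a real model, the returned list has the shape PyTorch optimizers accept for per-group options, for example `torch.optim.AdamW(build_param_groups(model.named_parameters()))`. The choice of AdamW here is also an assumption; the optimizer is not stated above.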
Test set DER (overlap included; strict = collar 0)
| Dataset | DER (collar=0) | DER (collar=0.25) | Meetings |
|---|---|---|---|
| AMI test | 17.34% | 13.95% | 2 |
| AliMeeting test | 14.14% | 8.66% | 5 |
| NOTSOFAR test | 13.49% | 8.38% | 5 |
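For readers unfamiliar with the metric, DER is conventionally the sum of missed speech, false alarm, and speaker confusion time divided by the total reference speech time. A collar (typically given in seconds) excludes a margin around each reference boundary from scoring, which is why the collar=0.25 figures are lower than the strict ones. The sketch below only illustrates the formula; the function name and the durations are hypothetical and are not measurements from this model.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_reference):
    """Return DER as a fraction. All arguments are durations in seconds."""
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (missed + false_alarm + confusion) / total_reference


# Hypothetical durations for illustration only.
der = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=10.0, total_reference=600.0
)
print(f"DER = {der:.2%}")
```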
Usage
import torch
from diarizen.pipelines.inference import DiariZenPipeline
# Load v2 base, then inject Echo Dia weights
pipe = DiariZenPipeline.from_pretrained("BUT-FIT/diarizen-wavlm-large-s80-md-v2")
sd = torch.load("pytorch_model.bin", map_location="cuda:0", weights_only=False)
pipe._segmentation.model.load_state_dict(sd, strict=False)
# Run
result = pipe("audio.wav")
for seg, _, spk in result.itertracks(yield_label=True):
    print(f"{seg.start:.1f}-{seg.end:.1f} {spk}")
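To score the output against a reference, the segments printed above can be written as RTTM lines, the format commonly used by diarization scoring tools. This is a minimal sketch: `to_rttm_lines` is a hypothetical helper, the recording identifier `"audio"` is a placeholder, and the input is assumed to be plain `(start, end, speaker)` tuples in seconds.

```python
def to_rttm_lines(segments, uri):
    """Format (start, end, speaker) tuples as RTTM SPEAKER lines.

    RTTM stores onset and duration rather than start and end, so the
    duration is computed as end - start.
    """
    lines = []
    for start, end, speaker in segments:
        duration = end - start
        lines.append(
            f"SPEAKER {uri} 1 {start:.3f} {duration:.3f} <NA> <NA> {speaker} <NA> <NA>"
        )
    return lines


# Hypothetical segments for illustration only.
for line in to_rttm_lines([(0.0, 1.5, "SPEAKER_00"), (1.2, 3.0, "SPEAKER_01")], "audio"):
    print(line)
```

With the pipeline output from the snippet above, the tuples could be collected as `[(seg.start, seg.end, spk) for seg, _, spk in result.itertracks(yield_label=True)]`.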
License
CC BY-NC 4.0 (inherited from base model). Non-commercial use only.