Hervé Bredin committed on
Commit 89a2e1b
1 Parent(s): aaad0e6

feat: initial import

Files changed (2):
  1. README.md +57 -0
  2. config.yaml +18 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ tags:
+ - pyannote
+ - audio
+ - voice
+ - speech
+ - speaker
+ - speaker-diarization
+ - speaker-change-detection
+ - voice-activity-detection
+ - overlapped-speech-detection
+ datasets:
+ - ami
+ - dihard
+ - voxconverse
+ - voxceleb
+ license: mit
+ inference: false
+ ---
+
+ # [pyannote.audio](https://github.com/pyannote/pyannote-audio) // speaker diarization
+
+ ```python
+ from pyannote.audio import Pipeline
+
+ pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization")
+ output = pipeline("audio.wav")
+
+ # yield_label=True makes itertracks yield (segment, track, label) triples
+ for speech_turn, _, speaker in output.itertracks(yield_label=True):
+     print(f"Speaker '{speaker}' speaks between t={speech_turn.start}s and t={speech_turn.end}s.")
+ ```
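The `output` above is a `pyannote.core.Annotation`. As a minimal illustration of what one diarization segment looks like on disk, the sketch below formats it as an RTTM line, the usual exchange format for diarization results (the helper name and numbers are hypothetical, not part of the pipeline API):

```python
# Sketch: serializing one diarization segment as an RTTM line.
# RTTM field layout: SPEAKER <uri> <channel> <onset> <duration>
#                    <NA> <NA> <speaker> <NA> <NA>
def to_rttm_line(uri, start, end, speaker):
    return (
        f"SPEAKER {uri} 1 {start:.3f} {end - start:.3f} "
        f"<NA> <NA> {speaker} <NA> <NA>"
    )

print(to_rttm_line("audio", 6.7, 9.2, "SPEAKER_01"))
# SPEAKER audio 1 6.700 2.500 <NA> <NA> SPEAKER_01 <NA> <NA>
```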
+
+ ## Benchmark
+
+ | Dataset | [Diarization error rate](http://pyannote.github.io/pyannote-metrics/reference.html#diarization) |
+ | --------------------------------------------------------------------------------------------------- | ------ |
+ | [AMI `only_words` evaluation set](https://github.com/BUTSpeechFIT/AMI-diarization-setup) | 21.3% |
+ | [DIHARD 3 evaluation set](https://arxiv.org/abs/2012.01477) | 22.2% |
+ | [VoxConverse 0.0.2 evaluation set](https://github.com/joonson/voxconverse) | 13.0% |
+
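Diarization error rate sums three error types (false alarm, missed detection, speaker confusion) over the total duration of reference speech. A toy computation of the metric, with illustrative durations that are not taken from the benchmark above:

```python
def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    # DER = (false alarm + missed detection + speaker confusion) / total speech
    return (false_alarm + missed + confusion) / total_speech

# toy durations in seconds: 3 s of false alarm, 5 s missed,
# 2 s of speaker confusion, over 100 s of reference speech
print(f"{diarization_error_rate(3.0, 5.0, 2.0, 100.0):.1%}")  # 10.0%
```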
+ ## Support
+
+ For commercial enquiries and scientific consulting, please contact [me](mailto:herve@niderb.fr).
+ For [technical questions](https://github.com/pyannote/pyannote-audio/discussions) and [bug reports](https://github.com/pyannote/pyannote-audio/issues), please use the [pyannote.audio](https://github.com/pyannote/pyannote-audio) GitHub repository.
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{Bredin2020,
+   Title = {{pyannote.audio: neural building blocks for speaker diarization}},
+   Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
+   Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
+   Address = {Barcelona, Spain},
+   Month = {May},
+   Year = {2020},
+ }
+ ```
config.yaml ADDED
@@ -0,0 +1,18 @@
+ pipeline:
+   name: pyannote.audio.pipelines.SpeakerDiarization
+   params:
+     segmentation: pyannote/segmentation
+     embedding: speechbrain/spkrec-ecapa-voxceleb
+     clustering: AgglomerativeClustering
+
+ params:
+   clustering:
+     method: average
+     threshold: 0.582398766878762
+   min_activity: 6.073193238899291
+   min_duration_off: 0.09791355693027545
+   min_duration_on: 0.05537587440407595
+   offset: 0.4806866463041527
+   onset: 0.8104268538848918
+   stitch_threshold: 0.04033955907446252
+
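The `onset`/`offset` pair above are hysteresis thresholds commonly used to binarize frame-level segmentation scores: a speech region opens when the score rises above `onset` and stays open until it falls below `offset`. A minimal pure-Python sketch of that interpretation (an assumption for illustration, not pyannote's actual implementation):

```python
def binarize(scores, onset=0.8104268538848918, offset=0.4806866463041527):
    """Hysteresis thresholding: a region becomes active when the score
    exceeds `onset` and stays active until it drops below `offset`.
    Returns (start, end) frame-index pairs."""
    active, regions, start = False, [], None
    for i, score in enumerate(scores):
        if not active and score > onset:
            active, start = True, i
        elif active and score < offset:
            active = False
            regions.append((start, i))
    if active:  # region still open at the end of the score sequence
        regions.append((start, len(scores)))
    return regions

print(binarize([0.1, 0.9, 0.7, 0.6, 0.3, 0.2]))  # [(1, 4)]
```

Note how the frames scoring 0.7 and 0.6 stay inside the region even though they are below `onset`: hysteresis avoids chopping a speech turn apart whenever the score briefly dips.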