bumstern committed on
Commit
dc3781c
1 Parent(s): b1dd720

Update README.md

  1. README.md +44 -0
README.md CHANGED
@@ -1,3 +1,47 @@
  ---
  license: mit
+ language:
+ - ru
+ library_name: pyannote-audio
+ tags:
+ - code
  ---
+
+ # Segmentation model
+
+ This model was trained on AMI-MixHeadset and my own synthetic dataset of Russian speech.
+
+ Training time: 5 hours on an RTX 3060
+
+ This model can be used as the segmentation model in the diarization pipeline from [pyannote/speaker-diarization](https://huggingface.co/pyannote/speaker-diarization).
+
18
+ | Benchmark | DER% |
19
+ | --------- |------|
20
+ | [AMI (*headset mix,*](https://groups.inf.ed.ac.uk/ami/corpus/) [*only_words*)](https://github.com/BUTSpeechFIT/AMI-diarization-setup) | 38.8 |
21
+
22
+ ## Usage example
23
+
+ ```python
+ import yaml
+ from yaml.loader import SafeLoader
+
+ import torch
+ from pyannote.audio import Model
+ from pyannote.audio.pipelines import SpeakerDiarization
+
+
+ # Load the pickled segmentation model onto the CPU
+ segm_model = torch.load('model/segm_model.pth', map_location=torch.device('cpu'))
+ # Download pyannote/embedding from the Hugging Face Hub using an access token
+ embed_model = Model.from_pretrained("pyannote/embedding", use_auth_token='ACCESS_TOKEN_GOES_HERE')
+ diar_pipeline = SpeakerDiarization(
+     segmentation=segm_model,
+     segmentation_batch_size=16,
+     clustering="AgglomerativeClustering",
+     embedding=embed_model
+ )
+
+ # Apply the pipeline hyperparameters stored in config.yaml
+ with open('model/config.yaml', 'r') as f:
+     diar_config = yaml.load(f, Loader=SafeLoader)
+ diar_pipeline.instantiate(diar_config)
+
+ # Run diarization on an audio file
+ annotation = diar_pipeline('audio.wav')
+ ```
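
The DER% column in the benchmark table of this README is a diarization error rate. As a minimal sketch, assuming the standard definition (false alarm, missed detection, and speaker confusion durations summed and divided by the total duration of reference speech), the arithmetic looks roughly like this. The function name and the durations below are hypothetical and are not taken from the model card; in practice a library such as pyannote.metrics computes these components, including the speaker mapping, from reference and hypothesis annotations.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total):
    """Return the diarization error rate as a percentage.

    All arguments are durations in seconds; `total` is the total
    duration of reference speech. Names are illustrative only.
    """
    if total <= 0:
        raise ValueError("total reference speech duration must be positive")
    return 100.0 * (false_alarm + missed_detection + confusion) / total


# Hypothetical durations, not measured on this model
der = diarization_error_rate(
    false_alarm=12.0, missed_detection=30.0, confusion=18.0, total=600.0
)
print(f"DER = {der:.1f}%")  # prints "DER = 10.0%"
```

Because the numerator includes false alarms, DER can exceed 100% on difficult recordings, so it should be read as an error ratio rather than a bounded accuracy score.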