nehulagrawal committed
Commit 019fd18 · Parent: 8c6f372

Update README.md

Files changed (1): README.md (+70 −1)
@@ -25,8 +25,77 @@ It achieves the following results on the evaluation set:

## Model description

This segmentation model has been trained on English data (Callhome) using diarizers. It can be loaded with two lines of code:

```python
from diarizers import SegmentationModel

segmentation_model = SegmentationModel().from_pretrained("nehulagrawal/speaker-segmentation-eng")
```

To use it within a pyannote speaker diarization pipeline, load the [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) pipeline and convert the model to a pyannote-compatible format:

```python
import torch

from diarizers import SegmentationModel
from pyannote.audio import Pipeline

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

# load the pre-trained pyannote pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")
pipeline.to(device)

# load the fine-tuned segmentation model, convert it to pyannote format,
# and swap it into the pipeline
model = SegmentationModel().from_pretrained("nehulagrawal/speaker-segmentation-eng")
model = model.to_pyannote_model()
pipeline._segmentation.model = model.to(device)
```

You can now use the pipeline on audio examples:

```python
from datasets import load_dataset

# load a dataset example
dataset = load_dataset("diarizers-community/callhome", "eng", split="data")
sample = dataset[0]["audio"]

# pre-process inputs: pyannote expects a (channel, time) waveform tensor
sample["waveform"] = torch.from_numpy(sample.pop("array")[None, :]).to(device, dtype=model.dtype)
sample["sample_rate"] = sample.pop("sampling_rate")

# perform inference
diarization = pipeline(sample)

# dump the diarization output to disk using RTTM format
with open("audio.rttm", "w") as rttm:
    diarization.write_rttm(rttm)
```
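The RTTM file written above is a plain space-separated text format, one speaker turn per line. As an illustrative sketch (the `parse_rttm` helper and the sample line below are assumptions for demonstration, not part of diarizers or pyannote), the segments can be read back with the standard library alone:

```python
# Minimal RTTM reader. Each SPEAKER line has ten space-separated fields:
# SPEAKER <file-id> <channel> <onset> <duration> <ortho> <stype> <speaker> <conf> <slat>
def parse_rttm(text):
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue  # skip blank or non-SPEAKER lines
        segments.append({
            "file": fields[1],
            "onset": float(fields[3]),
            "duration": float(fields[4]),
            "speaker": fields[7],
        })
    return segments

# hypothetical line in the shape the pipeline writes
example = "SPEAKER audio 1 0.031 1.562 <NA> <NA> SPEAKER_00 <NA> <NA>"
print(parse_rttm(example))
# → [{'file': 'audio', 'onset': 0.031, 'duration': 1.562, 'speaker': 'SPEAKER_00'}]
```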

You can also run the pipeline directly on a single audio file:

```python
import torch

from diarizers import SegmentationModel
from pyannote.audio import Pipeline

device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

# load the pre-trained pyannote pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")
pipeline.to(device)

# swap in the fine-tuned segmentation model
model = SegmentationModel().from_pretrained("nehulagrawal/speaker-segmentation-eng")
model = model.to_pyannote_model()
pipeline._segmentation.model = model.to(device)

# run diarization on a local file and save the result in RTTM format
diarization = pipeline("audio.wav")
with open("audio.rttm", "w") as rttm:
    diarization.write_rttm(rttm)
```

## Intended uses & limitations

More information needed