---
license: mit
language:
- ru
library_name: pyannote-audio
tags:
- code
---

# Segmentation model

This model was trained on AMI-MixHeadset and my own synthetic dataset of Russian speech. Training time: 5 hours on an RTX 3060.

This model can be used as the segmentation model in the diarization pipeline from [pyannote/speaker-diarization](https://huggingface.co/pyannote/speaker-diarization).

| Benchmark | DER% |
| --------- | ---- |
| [AMI (*headset mix,*](https://groups.inf.ed.ac.uk/ami/corpus/) [*only_words*)](https://github.com/BUTSpeechFIT/AMI-diarization-setup) | 38.8 |

## Usage example

```python
import yaml
from yaml.loader import SafeLoader

import torch
from pyannote.audio import Model
from pyannote.audio.pipelines import SpeakerDiarization

# Load the segmentation model from this repository
segm_model = torch.load('model/segm_model.pth', map_location=torch.device('cpu'))

# Load the pretrained embedding model (requires a Hugging Face access token)
embed_model = Model.from_pretrained("pyannote/embedding",
                                    use_auth_token='ACCESS_TOKEN_GOES_HERE')

# Build the diarization pipeline around the two models
diar_pipeline = SpeakerDiarization(
    segmentation=segm_model,
    segmentation_batch_size=16,
    clustering="AgglomerativeClustering",
    embedding=embed_model
)

# Instantiate the pipeline with the hyperparameters stored in config.yaml
with open('model/config.yaml', 'r') as f:
    diar_config = yaml.load(f, Loader=SafeLoader)

diar_pipeline.instantiate(diar_config)

# Run diarization on an audio file
annotation = diar_pipeline('audio.wav')
```
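
The pipeline returns a `pyannote.core.Annotation`. Below is a minimal sketch of inspecting and saving the result, continuing from the `annotation` variable above; the output file name `audio.rttm` is an arbitrary choice for this example.

```python
# Print each speaker turn: start time, end time, and speaker label
for segment, _, speaker in annotation.itertracks(yield_label=True):
    print(f"{segment.start:.1f}s - {segment.end:.1f}s: {speaker}")

# Save the result in the standard RTTM format
with open('audio.rttm', 'w') as f:
    annotation.write_rttm(f)
```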