---
license: mit
language:
- ru
library_name: pyannote-audio
tags:
- code
---

# Segmentation model

This model was trained on AMI-MixHeadset and my own synthetic dataset of Russian speech.

Training time: 5 hours on an RTX 3060.

This model can be used as the segmentation model in the [pyannote/speaker-diarization](https://huggingface.co/pyannote/speaker-diarization) pipeline.

| Benchmark | DER% |
| --------- | ---- |
| [AMI (*headset mix,*](https://groups.inf.ed.ac.uk/ami/corpus/) [*only_words*)](https://github.com/BUTSpeechFIT/AMI-diarization-setup) | 38.8 |
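DER (diarization error rate) is the fraction of reference speech time that is missed, falsely detected, or attributed to the wrong speaker. As a rough illustration only (the benchmark above was scored with the standard tooling, and a real scorer also finds the optimal reference-to-hypothesis speaker mapping, which this sketch skips), a frame-level approximation might look like:

```python
def der(reference, hypothesis, step=0.01):
    """Frame-level DER approximation over (start, end, speaker) segments.

    Simplified sketch: no optimal speaker mapping, no forgiveness collar.
    """
    end = max(seg[1] for seg in reference + hypothesis)
    n_frames = int(end / step)
    total = missed = false_alarm = confusion = 0
    for i in range(n_frames):
        t = (i + 0.5) * step  # midpoint of the frame
        ref = {spk for s, e, spk in reference if s <= t < e}
        hyp = {spk for s, e, spk in hypothesis if s <= t < e}
        total += len(ref)
        missed += max(len(ref) - len(hyp), 0)
        false_alarm += max(len(hyp) - len(ref), 0)
        confusion += min(len(ref), len(hyp)) - len(ref & hyp)
    return (missed + false_alarm + confusion) / total

# Hypothesis misses the last 2 s of a 10 s reference turn -> DER = 0.2
print(der([(0.0, 10.0, 'A')], [(0.0, 8.0, 'A')]))  # → 0.2
```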

## Usage example

```python
import yaml
from yaml.loader import SafeLoader

import torch
from pyannote.audio import Model
from pyannote.audio.pipelines import SpeakerDiarization

# Load the fine-tuned segmentation model and the pretrained embedding model
segm_model = torch.load('model/segm_model.pth', map_location=torch.device('cpu'))
embed_model = Model.from_pretrained("pyannote/embedding", use_auth_token='ACCESS_TOKEN_GOES_HERE')

# Build a diarization pipeline from the two models
diar_pipeline = SpeakerDiarization(
    segmentation=segm_model,
    segmentation_batch_size=16,
    clustering="AgglomerativeClustering",
    embedding=embed_model,
)

# Instantiate the pipeline with the hyperparameters from config.yaml
with open('model/config.yaml', 'r') as f:
    diar_config = yaml.load(f, Loader=SafeLoader)
diar_pipeline.instantiate(diar_config)

# Run diarization; the result is a pyannote.core.Annotation
annotation = diar_pipeline('audio.wav')

# Print the detected speaker turns
for turn, _, speaker in annotation.itertracks(yield_label=True):
    print(f'{turn.start:.1f}s - {turn.end:.1f}s: {speaker}')
```
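The `model/config.yaml` loaded above holds the pipeline hyperparameters passed to `instantiate`. A hypothetical example of its shape for the `SpeakerDiarization` pipeline (the keys follow pyannote's pipeline parameters; the values here are placeholders, the real ones ship with the model):

```yaml
segmentation:
  threshold: 0.58          # speech activity threshold (placeholder value)
  min_duration_off: 0.1    # fill non-speech gaps shorter than this, in seconds
clustering:
  method: centroid
  min_cluster_size: 15
  threshold: 0.72          # agglomerative clustering distance threshold
```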