Automatic Speech Recognition
pyannote.audio
pyannote
pyannote-audio-pipeline
audio
voice
speech
speaker
speaker-diarization
speaker-change-detection
voice-activity-detection
overlapped-speech-detection
Instructions to use DroolingPanda/speaker-diarization-community-1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- pyannote.audio
How to use DroolingPanda/speaker-diarization-community-1 with pyannote.audio:
```python
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("DroolingPanda/speaker-diarization-community-1")

# inference on the whole file
pipeline("file.wav")

# inference on an excerpt
from pyannote.core import Segment
excerpt = Segment(start=2.0, end=5.0)

from pyannote.audio import Audio
waveform, sample_rate = Audio().crop("file.wav", excerpt)
pipeline({"waveform": waveform, "sample_rate": sample_rate})
```
- Notebooks
- Google Colab
- Kaggle
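The pyannote.audio usage above runs inference on an excerpt given in seconds (2.0 s to 5.0 s). As a rough illustration of what such an excerpt means in samples, here is a minimal sketch of the arithmetic only. The helper name `excerpt_to_sample_range`, the 16 kHz sample rate, and the 10-second file length are assumptions made for this example; this is not how pyannote.audio implements cropping.

```python
def excerpt_to_sample_range(start_s, end_s, sample_rate, num_samples):
    """Map an excerpt in seconds to a [first, last) range of sample indices.

    Hypothetical helper for illustration; not part of pyannote.audio.
    """
    if end_s < start_s:
        raise ValueError("excerpt end must not precede its start")
    # Convert seconds to sample indices and clip to the bounds of the file.
    first = max(0, round(start_s * sample_rate))
    last = min(num_samples, round(end_s * sample_rate))
    return first, last


# Assumed values: a 10-second file sampled at 16 kHz, excerpt from 2.0 s to 5.0 s.
first, last = excerpt_to_sample_range(2.0, 5.0, sample_rate=16000, num_samples=160000)
print(first, last, last - first)  # 32000 80000 48000
```

Under these assumptions, the 3-second excerpt covers 48000 samples, so a waveform cropped this way would be expected to have about that many samples per channel.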
Copied from https://huggingface.co/pyannote/wespeaker-voxceleb-resnet34-LM
License
According to this page:
The pretrained model in WeNet follows the license of its corresponding dataset. For example, the pretrained model on VoxCeleb follows the Creative Commons Attribution 4.0 International License, since that is the license of the VoxCeleb dataset; see https://mm.kaist.ac.kr/datasets/voxceleb/.
Citation
@inproceedings{Wang2023,
title={Wespeaker: A research and production oriented speaker embedding learning toolkit},
author={Wang, Hongji and Liang, Chengdong and Wang, Shuai and Chen, Zhengyang and Zhang, Binbin and Xiang, Xu and Deng, Yanlei and Qian, Yanmin},
booktitle={ICASSP 2023, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={1--5},
year={2023},
organization={IEEE}
}