---
language: "en"
thumbnail:
tags:
- embeddings
- Speaker
- Verification
- Identification
- pytorch
- xvectors
- TDNN
license: "apache-2.0"
datasets:
- voxceleb
metrics:
- EER
- min_dct
---

# Speaker Verification with xvector embeddings on VoxCeleb

This repository provides all the necessary tools to extract speaker embeddings with a pretrained TDNN model using SpeechBrain. The system is trained on the VoxCeleb1 + VoxCeleb2 training data.

For a better experience, we encourage you to learn more about [SpeechBrain](https://speechbrain.github.io). The model's performance on the VoxCeleb1-test set (Cleaned) is:

| Release | EER(%) |
|:-------------:|:--------------:|
| 05-03-21 | 3.2 |

## Pipeline description

This system is composed of a TDNN model coupled with statistical pooling. The system is trained with Categorical Cross-Entropy Loss.

## Install SpeechBrain

First, install SpeechBrain with the following command:

```bash
pip install speechbrain
```

We also encourage you to read our tutorials and learn more about [SpeechBrain](https://speechbrain.github.io).

### Compute your speaker embeddings

```python
import torchaudio
from speechbrain.pretrained import EncoderClassifier

# Fetch the pretrained model (cached in savedir) and build the encoder
classifier = EncoderClassifier.from_hparams(source="speechbrain/spkrec-xvect-voxceleb", savedir="pretrained_models/spkrec-xvect-voxceleb")
# Load the waveform; fs is its sampling rate
signal, fs = torchaudio.load('samples/audio_samples/example1.wav')
embeddings = classifier.encode_batch(signal)
```

### Inference on GPU

To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.

### Training

The model was trained with SpeechBrain (aa018540). To train it from scratch, follow these steps:

1. Clone SpeechBrain:

   ```bash
   git clone https://github.com/speechbrain/speechbrain/
   ```

2. Install it:

   ```bash
   cd speechbrain
   pip install -r requirements.txt
   pip install -e .
   ```

3. Run training:

   ```bash
   cd recipes/VoxCeleb/SpeakerRec/
   python train_speaker_embeddings.py hparams/train_x_vectors.yaml --data_folder=your_data_folder
   ```

You can find our training results (models, logs, etc.) [here](https://drive.google.com/drive/folders/1RtCBJ3O8iOCkFrJItCKT9oL-Q1MNCwMH?usp=sharing).

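
### Compare two speaker embeddings

Once embeddings have been extracted with `encode_batch`, a common way to use them for verification is to score a pair with cosine similarity and compare the score to a threshold. The snippet below is a minimal sketch of that idea, not the scoring used to obtain the EER reported above. The helper names `cosine_score` and `same_speaker` are hypothetical, the `0.5` threshold is a placeholder that would need tuning on held-out trials, and the random tensors merely stand in for real embeddings (their `[1, 1, 512]` shape is an assumption about what `encode_batch` returns for a single utterance).

```python
import torch
import torch.nn.functional as F


def cosine_score(emb_a: torch.Tensor, emb_b: torch.Tensor) -> float:
    """Cosine similarity between two single-utterance embeddings."""
    # Flatten away any singleton batch/channel dimensions.
    a = emb_a.reshape(-1)
    b = emb_b.reshape(-1)
    return F.cosine_similarity(a, b, dim=0).item()


def same_speaker(emb_a: torch.Tensor, emb_b: torch.Tensor,
                 threshold: float = 0.5) -> bool:
    """Accept the trial when the score reaches the threshold.

    The default threshold is a placeholder, not a calibrated value.
    """
    return cosine_score(emb_a, emb_b) >= threshold


# Stand-ins for embeddings returned by classifier.encode_batch(signal).
torch.manual_seed(0)
emb1 = torch.randn(1, 1, 512)
emb2 = emb1 + 0.1 * torch.randn(1, 1, 512)  # slightly perturbed copy of emb1
emb3 = torch.randn(1, 1, 512)               # unrelated vector

print("near-duplicate pair:", cosine_score(emb1, emb2))
print("unrelated pair:", cosine_score(emb1, emb3))
```

With real audio, `emb1` and `emb2` would be replaced by the outputs of two `classifier.encode_batch(...)` calls on the recordings to compare.
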
### Limitations

The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.

#### Referencing xvectors

```bibtex
@inproceedings{DBLP:conf/odyssey/SnyderGMSPK18,
  author    = {David Snyder and
               Daniel Garcia{-}Romero and
               Alan McCree and
               Gregory Sell and
               Daniel Povey and
               Sanjeev Khudanpur},
  title     = {Spoken Language Recognition using X-vectors},
  booktitle = {Odyssey 2018},
  pages     = {105--111},
  year      = {2018},
}
```

#### Referencing SpeechBrain

```bibtex
@misc{SB2021,
  author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Chou, Ju-Chieh and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
  title = {SpeechBrain},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/speechbrain/speechbrain}},
}
```

#### About SpeechBrain

SpeechBrain is an open-source, all-in-one speech toolkit. It is designed to be simple, extremely flexible, and user-friendly. It obtains competitive or state-of-the-art performance in various domains.

Website: https://speechbrain.github.io/

GitHub: https://github.com/speechbrain/speechbrain