---
tags:
- pytorch_model_hub_mixin
- model_hub_mixin
- gender-classification
- VoxCeleb
license: mit
datasets:
- ProgramComputer/voxceleb
---

# Voice gender classifier

- This repo contains the inference code for a pretrained human voice gender classifier.
- You can also try the 🤗 [Hugging Face online demo](https://huggingface.co/spaces/JaesungHuh/voice-gender-classifier).

## Installation

First, clone the original [GitHub repository](https://github.com/JaesungHuh/voice-gender-classifier):

```
git clone https://github.com/JaesungHuh/voice-gender-classifier.git
```

Then install the required packages via pip:

```
cd voice-gender-classifier
pip install -r requirements.txt
```

## Usage

```
import torch

from model import ECAPA_gender

# Download the model directly from the Hugging Face model hub
model = ECAPA_gender.from_pretrained("JaesungHuh/ecapa-gender")
model.eval()

# Move the model to the GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Load the audio file and call predict to get the output directly
example_file = "data/00001.wav"
with torch.no_grad():
    output = model.predict(example_file, device=device)
    print("Gender : ", output)
```

## Pretrained weights

If you need the pretrained weights, download them [here](https://drive.google.com/file/d/1ojtaa6VyUhEM49F7uEyvsLSVN3T8bbPI/view?usp=sharing).

## Training details

State-of-the-art speaker verification models already produce good representations of the speaker's gender.

I used the pretrained ECAPA-TDNN from [TaoRuijie's](https://github.com/TaoRuijie/ECAPA-TDNN) repository, added one linear layer to make a two-class classifier, and fine-tuned the model on the VoxCeleb2 dev set.
The model achieved **98.7%** accuracy on the VoxCeleb1 identification test split.

## Caveat

Note that the training dataset used for this model (VoxCeleb) may not represent the global human population. Please be careful of unintended biases when using this model.

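Returning to the Usage snippet: `model.predict` is shown there for a single file. If you want to classify several files, one option is to wrap that call in a small loop. The following is only a minimal sketch. `predict_many` and `predict_fn` are hypothetical names introduced for illustration and are not part of the repository, and the return value of `predict` is treated as opaque.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, Union

PathLike = Union[str, Path]


def predict_many(
    paths: Iterable[PathLike],
    predict_fn: Callable[[str], Any],
) -> Dict[str, Any]:
    """Apply predict_fn to each audio path and collect the results.

    predict_many and predict_fn are hypothetical helpers, not part of the
    voice-gender-classifier repository.
    """
    results: Dict[str, Any] = {}
    # Sort the paths so the output order is deterministic.
    for path in sorted(str(p) for p in paths):
        # predict_fn is assumed to wrap model.predict(path, device=device).
        results[path] = predict_fn(path)
    return results


# Example wiring, assuming `model` and `device` are set up as in Usage:
# with torch.no_grad():
#     outputs = predict_many(
#         Path("data").glob("*.wav"),
#         lambda p: model.predict(p, device=device),
#     )
```

Passing the prediction step as a callable keeps the helper independent of the model, so it can be tried out with a stub function before any weights are downloaded.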
## Reference

- [Original GitHub repository](https://github.com/JaesungHuh/voice-gender-classifier)
- I modified the model architecture from [TaoRuijie's](https://github.com/TaoRuijie/ECAPA-TDNN) repository.
- For more details about ECAPA-TDNN, check the [paper](https://arxiv.org/abs/2005.07143).