facebook/wav2vec2-base-960h · model performs poorly on local machine

Jun 22, 2022

I followed the installation guide and cloned the repo to my machine.

But the model's output is completely wrong when I run the code below. If I instead use the online test option on the model's front page, I get perfect results. The audio file says "test one two three", which is transcribed exactly right in the browser test but completely wrong by the local model. Any suggestions would be greatly appreciated.

Hemanth-thunder

Jun 23, 2022

edited Jun 23, 2022

Hi mate, I also faced the same issue with longer WAV audio. Things to try:

1. Remove noise from the audio.
2. Split longer audio into chunks.
3. Use torch.no_grad() during inference.
4. Load the audio with librosa.
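
Suggestions 2 and 3 can be sketched roughly as below. This is only a minimal illustration, not code from this thread: `chunk_waveform` and `transcribe` are hypothetical helper names, the 10-second chunk length is an arbitrary assumption, and the waveform is assumed to be 16 kHz mono already. The model calls follow the usual `Wav2Vec2Processor` / `Wav2Vec2ForCTC` usage pattern.

```python
def chunk_waveform(waveform, sampling_rate, chunk_seconds=10.0):
    """Yield consecutive slices of at most `chunk_seconds` seconds (hypothetical helper)."""
    chunk_len = int(chunk_seconds * sampling_rate)
    if chunk_len <= 0:
        raise ValueError("chunk_seconds * sampling_rate must be at least 1 sample")
    for start in range(0, len(waveform), chunk_len):
        yield waveform[start:start + chunk_len]


def transcribe(waveform, sampling_rate=16_000, chunk_seconds=10.0):
    """Transcribe a 1-D float waveform chunk by chunk (assumes 16 kHz mono input)."""
    # Imported lazily so the chunking helper above has no heavy dependencies.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
    model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
    model.eval()

    texts = []
    for chunk in chunk_waveform(waveform, sampling_rate, chunk_seconds):
        inputs = processor(chunk, sampling_rate=sampling_rate, return_tensors="pt")
        with torch.no_grad():  # inference only, so skip gradient tracking
            logits = model(inputs.input_values).logits
        predicted_ids = torch.argmax(logits, dim=-1)
        texts.append(processor.batch_decode(predicted_ids)[0])
    return " ".join(texts)
```

Note that hard cuts like this can split a word across two chunks, and a very short final chunk may be too short for the model, so treat the chunk boundaries as something to tune for your own audio.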

sanchit-gandhi

Jun 28, 2022

edited Jun 29, 2022

Hi @jonasislive ! Have you verified that the sampling rate of the .wav file matches the sampling rate of the Wav2Vec2 processor? If not, you will need to resample the audio to 16kHz.
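
One quick way to check this is to read the rate from the .wav header and compare it with 16 kHz. The sketch below is only an illustration: the helper names are hypothetical, the header check uses Python's standard `wave` module (which only reads PCM .wav files), and the resampling step assumes librosa is installed and relies on `librosa.load(..., sr=...)` resampling on load.

```python
import wave

TARGET_SAMPLING_RATE = 16_000  # the 16 kHz rate the processor expects


def wav_sampling_rate(path):
    """Return the sampling rate (Hz) stored in a PCM .wav file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target=TARGET_SAMPLING_RATE):
    """True if the file's sampling rate differs from the target rate."""
    return wav_sampling_rate(path) != target


def load_at_16khz(path):
    """Load a file as a mono float waveform resampled to 16 kHz (requires librosa)."""
    import librosa  # imported lazily; only needed for the resampling step

    waveform, sampling_rate = librosa.load(path, sr=TARGET_SAMPLING_RATE, mono=True)
    return waveform, sampling_rate
```

If `needs_resampling("my_audio.wav")` returns True for your file (the filename here is just a placeholder), pass the output of `load_at_16khz` to the processor instead of the raw audio.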

For reference, you can paste fenced code blocks by placing triple backticks ``` before and after the code. As an example:

```python
import numpy as np
```

More info can be found in the docs.

Pasting code helps the community in running code and reproducing results! If you're able to modify the code snippet such that it is reproducible (i.e. we can run the code with the queried data) we'll be able to get to the bottom of this much more quickly! You could try uploading the .wav file to the Hub as a dataset (https://huggingface.co/docs/datasets/loading#inmemory-data):

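```python
from datasets import Dataset

# `audio` is the 1-D waveform array and `sampling_rate` its rate in Hz,
# both taken from your own .wav file.
# Each column maps to a list of rows, so the single audio sample is wrapped in a list.
audio_dict = {"audio": [{"array": audio, "sampling_rate": sampling_rate}]}
dataset = Dataset.from_dict(audio_dict)
dataset.push_to_hub("dummy_audio")
```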

Thanks!