Kotoba Technologies, Inc. Speech Team org
No description provided.

Currently, the mic and file inputs pass the audio through as is, but we need to resample it to 16 kHz.


The pipeline resamples when it receives a file path instead of an array, and the Gradio audio component passes a file path to the pipeline, so the audio is already being resampled to 16 kHz.
https://github.com/huggingface/transformers/blob/09f9f566de83eef1f13ee83b5a1bbeebde5c80c1/src/transformers/pipelines/automatic_speech_recognition.py#L361
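For reference, here is a minimal sketch of how the two input styles could be handled before calling the pipeline. The helper name `prepare_asr_input` is hypothetical and not part of this Space. It assumes Gradio's `type="numpy"` convention of a `(sample_rate, samples)` tuple, and that the ASR pipeline accepts a dict with `sampling_rate` and `raw` keys, which as far as I know lets it resample on its own (this path needs `torchaudio` installed).

```python
import numpy as np


def prepare_asr_input(audio):
    """Hypothetical helper: shape a Gradio audio value for the ASR pipeline."""
    if isinstance(audio, str):
        # File path: the pipeline decodes the file and resamples it itself,
        # so nothing needs to be done here.
        return audio

    # Assumed Gradio type="numpy" format: (sample_rate, samples).
    sampling_rate, samples = audio
    samples = np.asarray(samples)

    # Integer PCM (e.g. int16) is scaled to float32 in roughly [-1, 1].
    if np.issubdtype(samples.dtype, np.integer):
        samples = samples.astype(np.float32) / np.iinfo(samples.dtype).max
    else:
        samples = samples.astype(np.float32)

    # Downmix (num_samples, num_channels) audio to mono.
    if samples.ndim == 2:
        samples = samples.mean(axis=1)

    # Passing the original sampling rate alongside the array lets the
    # pipeline see that it differs from the model's expected rate.
    return {"sampling_rate": int(sampling_rate), "raw": samples}
```

With a file path (what the Gradio audio component passes here), the helper is a no-op, which matches the behavior described above. It would only matter if the component were switched to return arrays, since a bare array carries no sampling rate and the pipeline has no way to know it needs resampling.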

asahi417 changed pull request status to closed
