Spaces: kotoba-tech/kotoba-whisper-demo

fix_sampling_rate

by asahi417 - opened Apr 16

Kotoba Technologies org Apr 16

No description provided.

Kotoba Technologies org Apr 16

Currently, the mic and file inputs pass the audio through as is, but we need to convert the audio to 16 kHz.

Kotoba Technologies org Apr 17

The pipeline resamples when it receives a file path instead of an array, and the Gradio audio component passes a file path to the pipeline, so the audio is already being resampled to 16 kHz.
https://github.com/huggingface/transformers/blob/09f9f566de83eef1f13ee83b5a1bbeebde5c80c1/src/transformers/pipelines/automatic_speech_recognition.py#L361
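
For reference, if the demo ever switched to passing raw arrays instead of file paths, the conversion would have to happen on our side. Below is a minimal sketch of that, assuming the input arrives as a `(sample_rate, array)` pair. The helper name `to_16k_mono` and the constant `TARGET_SR` are hypothetical and not part of the demo or of transformers, and plain linear interpolation stands in for a proper resampler (it has no anti-aliasing filter, so it is only illustrative).

```python
import numpy as np

TARGET_SR = 16_000  # assumed target rate, matching the 16 kHz mentioned above


def to_16k_mono(sample_rate: int, audio: np.ndarray) -> np.ndarray:
    """Hypothetical helper: convert audio to float32 mono at 16 kHz."""
    # Integer PCM (e.g. int16) is scaled to roughly [-1, 1]; floats are kept as is.
    is_int = np.issubdtype(audio.dtype, np.integer)
    scale = np.iinfo(audio.dtype).max if is_int else 1
    audio = audio.astype(np.float32)
    if is_int:
        audio /= scale

    # Down-mix (samples, channels) input to mono by averaging the channels.
    if audio.ndim == 2:
        audio = audio.mean(axis=1)

    if sample_rate == TARGET_SR:
        return audio

    # Linear interpolation onto a 16 kHz time grid (illustrative only).
    n_out = int(round(len(audio) * TARGET_SR / sample_rate))
    old_t = np.arange(len(audio)) / sample_rate
    new_t = np.arange(n_out) / TARGET_SR
    return np.interp(new_t, old_t, audio).astype(np.float32)
```

With the file path input the demo uses today, none of this is needed, since the pipeline handles the resampling itself.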

asahi417 changed pull request status to closed Apr 17
