Spaces: kotoba-speech/kotoba-whisper-demo

fix_sampling_rate

by asahi417 - opened Apr 16, 2024


Apr 16, 2024

No description provided.

Apr 16, 2024

Currently, the mic and file inputs pass the audio to the model as is, but we need to resample the audio to 16 kHz first.

Apr 17, 2024

The pipeline resamples when it receives a file path instead of an array, and the Gradio audio component passes a file path to the pipeline, so the audio is already being resampled to 16 kHz.
https://github.com/huggingface/transformers/blob/09f9f566de83eef1f13ee83b5a1bbeebde5c80c1/src/transformers/pipelines/automatic_speech_recognition.py#L361
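For reference, here is a minimal sketch of how the array case could be handled if the demo ever switched the Gradio audio component to `type="numpy"`. As far as I know, the pipeline treats a bare array as already being at the model's sampling rate, but it also accepts a dict with `sampling_rate` and `raw` keys and resamples from that. The helper name `to_pipeline_input` is hypothetical and is not part of this Space or of transformers.

```python
import numpy as np


def to_pipeline_input(audio):
    """Hypothetical helper: normalize a Gradio audio value for an ASR pipeline.

    Assumes `audio` is either a file path (Gradio `type="filepath"`) or a
    `(sampling_rate, samples)` tuple (Gradio `type="numpy"`).
    """
    # A file path can be passed through unchanged: the pipeline decodes the
    # file and resamples it to the feature extractor's sampling rate.
    if isinstance(audio, str):
        return audio

    sampling_rate, samples = audio
    samples = np.asarray(samples)

    # Integer PCM (e.g. int16) is scaled to float32 in roughly [-1.0, 1.0].
    if np.issubdtype(samples.dtype, np.integer):
        samples = samples.astype(np.float32) / np.iinfo(samples.dtype).max
    else:
        samples = samples.astype(np.float32)

    # Down-mix (num_samples, num_channels) audio to mono.
    if samples.ndim == 2:
        samples = samples.mean(axis=1)

    # Keeping the original sampling rate in the dict is what lets the
    # pipeline decide whether resampling is needed.
    return {"sampling_rate": int(sampling_rate), "raw": samples}
```

With the current `type="filepath"` setup none of this is needed, which is why the PR was closed.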

asahi417 changed pull request status to closed Apr 17, 2024
