Additional output from the model apart from the transcribed text
I don't think we're planning on adding transcriptions on a segment-wise basis. Once this PR is merged, we'll have utterance-level timestamps though!
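For context, with `return_timestamps=True` the ASR pipeline returns the full text plus a list of timestamped chunks. A small sketch of that output shape and how it can be post-processed (the sample transcript and timestamps below are made up for illustration):

```python
# Illustrative shape of the ASR pipeline output when return_timestamps=True.
# The sample text and timestamps are invented; real values depend on the audio.
sample_output = {
    "text": "Hello world. How are you?",
    "chunks": [
        {"timestamp": (0.0, 1.5), "text": "Hello world."},
        {"timestamp": (1.5, 3.2), "text": "How are you?"},
    ],
}

def utterances_with_times(result):
    """Flatten the pipeline result into (start, end, text) tuples."""
    return [
        (chunk["timestamp"][0], chunk["timestamp"][1], chunk["text"])
        for chunk in result["chunks"]
    ]

print(utterances_with_times(sample_output))
```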
Hi @sanchit-gandhi , I am new on HF so this might be a silly question but how do you employ the changes introduced in the PR? Is it done centrally so that whisper-large-v2 eventually can be deployed with them?
Hey @orangediamond ! You can get the latest changes by pip installing Transformers from the main branch (see https://huggingface.co/docs/transformers/installation#install-from-source):

```
pip install git+https://github.com/huggingface/transformers
```

Otherwise, a new PyPI package version (4.26.0) with the latest changes should be available later this week:

```
pip install -U transformers
```

If you want the changes now, you're better off installing from the main branch! Otherwise, keep tabs on https://github.com/huggingface/transformers/releases for the releases.
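If it helps, you can verify which build you're running: installs from the main branch report a development version with a `.dev0` suffix, while PyPI releases report a plain version number. A tiny sketch (the version strings are illustrative):

```python
# Installs from the Transformers main branch report a ".dev0" development
# version (e.g. "4.26.0.dev0"); PyPI releases report a plain "4.26.0".
# Check yours with: import transformers; print(transformers.__version__)
def is_main_install(version: str) -> bool:
    """Heuristic: dev builds from source carry a ".dev" suffix."""
    return ".dev" in version

print(is_main_install("4.26.0.dev0"))  # True
print(is_main_install("4.26.0"))       # False
```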
Thanks for your reply @sanchit-gandhi ! I was actually not referring to local installation (sorry for being unclear) but rather to deploying the discussed changes as inference endpoints. As far as I can tell, the Whisper model that gets deployed does not contain the changes here - any pointers to that end? I might be misunderstanding how everything ties together in the Hugging Face ecosystem.
I have investigated a few things without luck:

- Setting `return_timestamps=True` with detailed parameters when calling the Inference API does not seem to be possible
- Creating a custom model with the option predefined in `generation_config.json` does not seem to work for me
- I don't see a way to fork a model repository, and if I manually clone it I am unable to push the pretrained models using Git LFS
- Loading and invoking the model locally is unfortunately not viable
Not sure what to try at this point. If anyone (maybe @sanchit-gandhi, sorry for the extra ping) has ideas, I'm all ears 🙏
Hey @orangediamond ,
Thanks for clarifying and apologies for the delayed response! Indeed, it seems like detailed parameters are not supported for the ASR pipeline. Would a custom inference handler be better suited in this case? https://huggingface.co/docs/inference-endpoints/guides/custom_handler#create-custom-inference-handler
Here, you might be able to adjust the `return_timestamps` arg and set it to `True`.

We just need to make sure that we've installed Transformers from main. To do so, we can add the following to our requirements file (https://huggingface.co/docs/inference-endpoints/guides/custom_dependencies#add-custom-dependencies):

```
git+https://github.com/huggingface/transformers
```

This will install the main branch of Transformers, which has the latest changes for Whisper.
Thanks for the suggestion! I do seem to be getting the same error when attempting to `git push` anything to HF. This is what I have done:

- I cloned whisper-large-v2 with `GIT_LFS_SKIP_SMUDGE=1` and used it as a template for my experiment:

```
λ GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/openai/whisper-large-v2
```
- I created a custom handler as described in your link:

```python
from typing import Any, Dict, List

import torch
from transformers import pipeline


class EndpointHandler:
    def __init__(self, path=""):
        # use the GPU if one is available, otherwise fall back to CPU
        device = 0 if torch.cuda.is_available() else -1
        self.pipeline = pipeline(
            "automatic-speech-recognition",
            model="openai/whisper-large-v2",
            chunk_length_s=30,
            device=device,
        )

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        inputs = data.pop("inputs", data)
        prediction = self.pipeline(inputs, return_timestamps=True)
        return prediction
```
- I created `requirements.txt` with `git+https://github.com/huggingface/transformers` as the only dependency
- I now have the following files in my repo:

```
λ ls
added_tokens.json  generation_config.json  merges.txt  preprocessor_config.json  README.md  special_tokens_map.json  tokenizer_config.json
config.json  handler.py  normalizer.json  pytorch_model.bin  requirements.txt  tf_model.h5  vocab.json
```
- I `git add .`, `git commit -m [...]`, and attempt to push:

```
λ git push
Uploading LFS objects: 0% (0/2), 0 B | 0 B/s, done.
batch response: Repository not found
error: failed to push some refs to 'https://huggingface.co/orangediamond/whizper'
```

For reference, I use the following remote:

```
λ git remote get-url --all origin
https://huggingface.co/orangediamond/whizper
```
I am clearly doing something wrong but it is unclear to me what exactly it is. Any ideas @sanchit-gandhi ?
Hey @orangediamond , sorry for the delayed response! Does the repo orangediamond/whizper exist on the Hub before you do the `git push`? I can't see it there (but it might be private). Could you make sure that you're correctly pushing to the target repo?
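One thing worth checking: a `Repository not found` error on push is often an authentication problem rather than a truly missing repo, especially for private repos. A hedged sketch of embedding your Hub username and a write access token in the remote URL (the token value is a placeholder, and the demo uses a throwaway local repo so nothing is actually pushed):

```shell
# Hypothetical fix for "batch response: Repository not found" on push:
# embed your Hub username and a *write* access token in the remote URL.
# "hf_TOKEN" is a placeholder; generate a real token in your Hub settings.
demo=$(mktemp -d)
git init -q "$demo"
git -C "$demo" remote add origin "https://orangediamond:hf_TOKEN@huggingface.co/orangediamond/whizper"
git -C "$demo" remote get-url origin  # confirm the remote is set as expected
```

In an existing clone you would use `git remote set-url origin …` instead of `remote add`.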
No worries @sanchit-gandhi . It does exist (privately) but I am not sure if it does so in the capacity that is required:
I would have imagined the above to suffice as far as being able to push to the repo goes?
Hey @orangediamond ! Indeed, that should be sufficient. Maybe it's worth trying to clone the repo explicitly to your local device, moving all the files there, and then pushing from within the repo?
This might be off-topic, but would an HF Space suffice for your deployment? E.g. the one here: https://huggingface.co/spaces/sanchit-gandhi/whisper-large-v2
You can click "view API" at the bottom of the page to see how to send requests etc.
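For reference, sending a request to a hosted endpoint generally means POSTing the raw audio bytes with a bearer token in the header. A minimal stdlib sketch (the URL and token below are placeholders, and the request is only built, not actually sent):

```python
# Sketch of assembling a request to a hosted ASR endpoint.
# ENDPOINT_URL and HF_TOKEN are placeholders, not real values.
import urllib.request

ENDPOINT_URL = "https://api-inference.huggingface.co/models/openai/whisper-large-v2"
HF_TOKEN = "hf_xxx"  # replace with your own access token

def build_request(audio_bytes: bytes) -> urllib.request.Request:
    """Assemble (but do not send) the POST request for the endpoint."""
    return urllib.request.Request(
        ENDPOINT_URL,
        data=audio_bytes,
        headers={"Authorization": f"Bearer {HF_TOKEN}"},
        method="POST",
    )

req = build_request(b"\x00\x01")  # dummy bytes stand in for real audio
print(req.get_method(), req.get_full_url())
```

Sending it would be `urllib.request.urlopen(req)` (or the equivalent with `requests`), with the JSON transcription in the response body.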