openai/whisper-large


Add "<|startoftranscript|>" to forced decoder ids

#14

by sanchit-gandhi HF staff - opened Dec 7, 2022

base: refs/heads/main ← from: refs/pr/14


sanchit-gandhi

Dec 7, 2022 • edited Dec 7, 2022

This PR replaces <|translate|><|notimestamps|> with <|startoftranscript|><|en|><|transcribe|><|notimestamps|> in the forced decoder ids.

Add "<|startoftranscript|>" to forced decoder ids (b6a98530)

ArthurZ

Dec 7, 2022

That's a pretty big change: you are also adding more tokens.
I think the reason we only have the 2 tokens by default is for testing purposes. I agree that, depending on the usage, we should rather hard-code them in the tests.

ArthurZ

Dec 7, 2022

Also, the reason we don't have <|startoftranscript|> in forced_decoder_ids is that it is already set via decoder_start_token_id.
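To make the point about decoder_start_token_id concrete, here is a minimal sketch of how the forced prefix can be thought of. It is a simplified model, not the library's generation code: the helper build_decoder_prefix is hypothetical, and the token ids are assumptions taken from the multilingual Whisper vocabulary that should be checked against the checkpoint's tokenizer.

```python
# Assumed token ids from the multilingual Whisper vocabulary;
# verify them against the checkpoint's tokenizer before relying on them.
SOT = 50258            # <|startoftranscript|>
TRANSLATE = 50358      # <|translate|>
NO_TIMESTAMPS = 50363  # <|notimestamps|>


def build_decoder_prefix(decoder_start_token_id, forced_decoder_ids):
    """Return the token ids the decoder is forced to begin with.

    Hypothetical helper: position 0 is always decoder_start_token_id, and
    each (position, token_id) pair in forced_decoder_ids pins one later
    position, so the pairs are expected to start at position 1.
    """
    prefix = [decoder_start_token_id]
    for position, token_id in sorted(forced_decoder_ids):
        if position != len(prefix):
            raise ValueError(f"gap or overlap at position {position}")
        prefix.append(token_id)
    return prefix


prefix = build_decoder_prefix(SOT, [(1, TRANSLATE), (2, NO_TIMESTAMPS)])
print(prefix)  # [50258, 50358, 50363]
```

Under this model, <|startoftranscript|> already occupies position 0, so listing it again in forced_decoder_ids would either collide with position 0 or shift every later token by one.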

sanchit-gandhi changed pull request status to closed Dec 7, 2022

sanchit-gandhi

Dec 7, 2022

We should still set the language in the forced decoder ids though, no? As we do for, say, the medium checkpoint:
https://huggingface.co/openai/whisper-medium/blob/main/config.json#L26-L39

For the large checkpoint, we're currently setting <|translate|><|notimestamps|>.

For all the other multilingual checkpoints, we're setting <|en|><|transcribe|><|notimestamps|>.
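The difference between the two settings can be sketched as [position, token_id] pairs, the shape used in the linked config.json. This is an illustration only: TOKEN_IDS and to_forced_decoder_ids are hypothetical names, and the ids are assumed from the multilingual Whisper vocabulary rather than read from either checkpoint.

```python
# Assumed ids from the multilingual Whisper vocabulary (illustrative table).
TOKEN_IDS = {
    "<|en|>": 50259,
    "<|translate|>": 50358,
    "<|transcribe|>": 50359,
    "<|notimestamps|>": 50363,
}


def to_forced_decoder_ids(tokens):
    """Map prompt tokens to [position, token_id] pairs.

    Positions start at 1 because position 0 is taken by
    decoder_start_token_id (<|startoftranscript|>).
    """
    return [[pos, TOKEN_IDS[tok]] for pos, tok in enumerate(tokens, start=1)]


# What the large checkpoint is described as setting: no language token.
large_current = to_forced_decoder_ids(["<|translate|>", "<|notimestamps|>"])

# What the other multilingual checkpoints are described as setting.
other_multilingual = to_forced_decoder_ids(
    ["<|en|>", "<|transcribe|>", "<|notimestamps|>"]
)

print(large_current)       # [[1, 50358], [2, 50363]]
print(other_multilingual)  # [[1, 50259], [2, 50359], [3, 50363]]
```

In transformers, WhisperProcessor.get_decoder_prompt_ids(language=..., task=...) should return pairs of this shape for a given language and task, which avoids hard-coding ids by hand.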
