sanchit-gandhi HF staff committed on
Commit
b65535b
1 Parent(s): baca495

Add pad token to tokenizer


Whisper large is missing the pad token, which the tiny through medium checkpoints already define (e.g. https://huggingface.co/openai/whisper-medium/blob/main/special_tokens_map.json#L125). This PR adds the pad token to the large checkpoint.
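For illustration only, the effect of this change on the token map can be sketched in plain Python. The helper name `add_pad_token` and the toy `large_map` dictionary are hypothetical stand-ins, not the real file or any library API; the sketch assumes the map is ordinary JSON whose key order does not matter.

```python
import json


def add_pad_token(special_tokens_map: dict, pad_token: str = "<|endoftext|>") -> dict:
    """Return a copy of the map with `pad_token` set; an existing value is kept."""
    updated = dict(special_tokens_map)
    # setdefault only writes the key when it is absent
    updated.setdefault("pad_token", pad_token)
    return updated


# Toy stand-in for the large checkpoint's special_tokens_map.json (hypothetical, heavily trimmed)
large_map = json.loads(
    '{"unk_token": {"content": "", "lstrip": false, "rstrip": false, "single_word": false}}'
)

patched = add_pad_token(large_map)
print(json.dumps(patched, indent=2))
```

The commit itself places the new key directly before `unk_token`; the sketch appends it instead, which is equivalent for a JSON object.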

Files changed (1)
  1. special_tokens_map.json +1 -0
special_tokens_map.json CHANGED
@@ -122,6 +122,7 @@
     "rstrip": false,
     "single_word": false
   },
+  "pad_token": "<|endoftext|>",
   "unk_token": {
     "content": "",
     "lstrip": false,