Fix Conversational Widget

This PR will fix the widget for this model by setting explicit eos_token/bos_token. It should not affect existing users loading the model/tokenizers from the libraries.

PR changes have been generated using:
```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")

tokenizer.save_pretrained('path/to/local/dir')
```

(related to PR [microsoft/DialoGPT-medium/discussions/17](https://huggingface.co/microsoft/DialoGPT-medium/discussions/16))

Files changed (1) hide show

tokenizer_config.json +21 -2

tokenizer_config.json CHANGED Viewed

@@ -1,4 +1,23 @@
 {
   "model_max_length": 1024,
-  "chat_template": "{% for message in messages %}{{ message.content }}{{ eos_token }}{% endfor %}"
-}

 {
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "50256": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<|endoftext|>",
+  "chat_template": "{% for message in messages %}{{ message.content }}{{ eos_token }}{% endfor %}",
+  "clean_up_tokenization_spaces": true,
+  "eos_token": "<|endoftext|>",
+  "errors": "replace",
   "model_max_length": 1024,
+  "pad_token": null,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}