<|im_start|> is missing special-token status
#4, opened by xxxTEMPESTxxx
Correct me if I am wrong, but the tokenizer config has:
"50296": {
  "content": "<|im_start|>",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false,
  "special": false
}
From what I have seen with other models that use the OpenAI chat format (ChatML), <|im_start|> is registered with the tokenizer as a special token. That status is missing on all the phi-2 finetunes that enforce the ChatML format, which is quite strange, as it basically makes the model's life harder.
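For anyone who wants to check their own finetune, here is a rough sketch of how the flag could be inspected with only the standard library. It assumes the entry lives under an `added_tokens_decoder` key in `tokenizer_config.json`. The `<|im_end|>` entry with id 50295 and the helper name `non_special_markers` are made up for illustration; only the 50296 entry comes from the config quoted above.

```python
import json

# Hypothetical excerpt of a tokenizer_config.json. Only the 50296 entry is
# taken from the post; the 50295 entry is invented to show the contrast.
CONFIG_TEXT = """
{
  "added_tokens_decoder": {
    "50295": {"content": "<|im_end|>", "special": true},
    "50296": {"content": "<|im_start|>", "special": false}
  }
}
"""

# ChatML turn markers we expect to be registered as special tokens.
CHATML_MARKERS = {"<|im_start|>", "<|im_end|>"}


def non_special_markers(config, markers=CHATML_MARKERS):
    """Return sorted (token_id, content) pairs for markers not flagged special."""
    entries = config.get("added_tokens_decoder", {})
    return sorted(
        (int(token_id), entry["content"])
        for token_id, entry in entries.items()
        if entry.get("content") in markers and not entry.get("special", False)
    )


config = json.loads(CONFIG_TEXT)
flagged = non_special_markers(config)
for token_id, content in flagged:
    print(f"{token_id}: {content} is not marked special")
```

To run it against a real checkpoint, replace `CONFIG_TEXT` with the contents of that model's `tokenizer_config.json`. If the flag does turn out to matter, re-registering the marker through `tokenizer.add_special_tokens` in `transformers` is one route to try, though whether it overrides an existing entry may depend on the library version.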