Missing <|im_start|> from tokenizer_config.json
#3 opened by bartowski
Curious why it's missing. It seems to break tokenization, because the token is not marked as a special token.
Adding it to tokenizer_config.json fixes my tokenization issue.
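For anyone wanting to check their own copy, here is a minimal sketch of how one might inspect a tokenizer_config.json for this. It assumes the file uses the usual `added_tokens_decoder` layout (entries with `content` and `special` fields) and an optional `additional_special_tokens` list of strings. The helper name `special_token_status`, the example fragment, and the token ids in it are hypothetical and are not taken from this repository.

```python
import json


def special_token_status(config: dict, token: str) -> str:
    """Return 'special', 'not_special', or 'missing' for `token`.

    Assumes the common tokenizer_config.json layout; other layouts
    (e.g. dict entries in additional_special_tokens) are not handled.
    """
    # added_tokens_decoder maps token ids (as strings) to token descriptions.
    for entry in config.get("added_tokens_decoder", {}).values():
        if entry.get("content") == token:
            return "special" if entry.get("special", False) else "not_special"
    # Fall back to the plain list of extra special tokens, if present.
    if token in config.get("additional_special_tokens", []):
        return "special"
    return "missing"


# Hypothetical config fragment: ids and contents are illustrative only.
example_config = json.loads("""
{
  "added_tokens_decoder": {
    "1": {"content": "<|im_end|>", "special": true},
    "2": {"content": "<|endoftext|>", "special": true}
  }
}
""")

for tok in ("<|im_start|>", "<|im_end|>"):
    print(tok, "->", special_token_status(example_config, tok))
```

To run it against a real file, replace `example_config` with `json.load(open("tokenizer_config.json"))`. A result of `missing` or `not_special` for `<|im_start|>` would be consistent with the behavior described above, though it does not by itself prove that this is the cause.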
Hi @bartowski, thank you very much! Could you share exactly how you are using it (for example, your inference code)? This would help us reproduce your issue accurately. Thanks again!
(closing this one to continue discussion in the PR)
bartowski changed discussion status to closed