Do not add EOS token when tokenizing by default
#4 opened by p1atdev
This PR reduces the confusion around tokenizer loading.
The current setting requires loading the tokenizer with add_eos_token=False;
otherwise the EOS token is appended automatically, leading to weird completion results.
- Before:
tokenizer = AutoTokenizer.from_pretrained("sbintuitions/sarashina2-8x70b", add_eos_token=False)
- After:
tokenizer = AutoTokenizer.from_pretrained("sbintuitions/sarashina2-8x70b")
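A quick way to confirm the behavior is to check whether the last encoded id equals the tokenizer's EOS id. The sketch below is only an illustration: `appends_eos` is a hypothetical helper, and `ToyTokenizer` is a made-up stand-in (not the sarashina2 tokenizer) so the example runs without downloading anything. It assumes the tokenizer is callable, returns a mapping with "input_ids", and exposes `eos_token_id`, as Hugging Face tokenizers do.

```python
def appends_eos(tokenizer, text="Hello"):
    """Return True if encoding `text` ends with the tokenizer's EOS id."""
    ids = tokenizer(text)["input_ids"]
    return bool(ids) and ids[-1] == tokenizer.eos_token_id


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer, used only for this demo."""

    eos_token_id = 2

    def __init__(self, add_eos_token=False):
        self.add_eos_token = add_eos_token

    def __call__(self, text):
        # Fake ids in the range 10..59, so they never collide with the EOS id.
        ids = [10 + (ord(ch) % 50) for ch in text]
        if self.add_eos_token:
            ids.append(self.eos_token_id)
        return {"input_ids": ids}


print(appends_eos(ToyTokenizer(add_eos_token=True)))
print(appends_eos(ToyTokenizer(add_eos_token=False)))
```

With the real model, the same helper could be called on the object returned by AutoTokenizer.from_pretrained("sbintuitions/sarashina2-8x70b"); after this PR it would be expected to report that no EOS token is appended.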
"add_eos_token": false
in tokenizer_config.json is the same as sbintuitions/sarashina2-70b's.
https://huggingface.co/sbintuitions/sarashina2-70b/blob/main/tokenizer_config.json#L134
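To check the flag without loading the tokenizer, the config file can be read as plain JSON. This is a minimal sketch: `read_add_eos_token` is a hypothetical helper, and the sample string is an illustrative fragment, not the full contents of the real tokenizer_config.json.

```python
import json


def read_add_eos_token(config_text):
    """Return the "add_eos_token" value from config text, or None if the key is absent."""
    return json.loads(config_text).get("add_eos_token")


# Illustrative fragment only; the real file contains many more keys.
sample = '{"add_eos_token": false}'
print(read_add_eos_token(sample))
```

For a downloaded copy of the file, the same function could be applied to the text returned by reading tokenizer_config.json from disk.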
Thank you. LGTM!
kajyuuen changed pull request status to merged