Why is the "vocab_size" in the config file 50272 when len(tokenizer) is 50265?

#16
by zchill - opened

Why is the "vocab_size" in the config file 50272 when len(tokenizer) is 50265? @patrickvonplaten

zchill changed discussion status to closed

Did you find an answer? I think it's the same reason as for T5 here: https://github.com/huggingface/transformers/issues/4875#issuecomment-647634437
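For what it's worth, the two numbers from the question fit the explanation that the embedding matrix is padded past the tokenizer's vocabulary. Here is a quick sketch of the arithmetic. The multiple of 8 is my assumption (nothing in this thread confirms it), and `pad_to_multiple` is just a helper I made up for illustration, not a library function:

```python
def pad_to_multiple(n: int, multiple: int) -> int:
    """Round n up to the nearest multiple of `multiple`."""
    return ((n + multiple - 1) // multiple) * multiple


tokenizer_len = 50265  # len(tokenizer), from the question
config_vocab = 50272   # "vocab_size" in the config, from the question

# Assumption: the padding target is a multiple of 8 (not confirmed here).
padded = pad_to_multiple(tokenizer_len, 8)

print(padded, padded - tokenizer_len)  # padded size and number of extra rows
print(padded == config_vocab)          # does it match the config value?
```

If that is what is going on, the 7 extra embedding rows would not correspond to any token the tokenizer produces, so the mismatch should be harmless for normal use.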
