eos_token_id discrepancy
Hello, I don't know if this was intended, but `model.generation_config.eos_token_id != tokenizer.eos_token_id`. This discrepancy can cause problems when training the model with the tokenizer's EOS token appended, while generation is expected to stop on a different token.
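For reference, a minimal sketch of the check I mean (the checkpoint name is a placeholder; any causal LM loaded through transformers exposes the same two fields):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint name; substitute the Imp checkpoint you are using.
model_id = "MILVLG/imp-v1-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# If these differ, generation may not stop on the EOS token used during training.
print(tokenizer.eos_token_id)
print(model.generation_config.eos_token_id)
```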
Hi, thanks for your interest in Imp. In our tests, `model.generation_config.eos_token_id` and `tokenizer.eos_token_id` use the same token id (`</s>`).
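One way to double-check which token each id maps to (a sketch using standard transformers APIs; `tokenizer` and `model` are assumed to be loaded as above):

```python
# generation_config.eos_token_id may be an int or a list of ints.
eos_from_tokenizer = tokenizer.eos_token_id
eos_from_config = model.generation_config.eos_token_id

# Decode both ids back to token strings to spot silent mismatches.
print(tokenizer.convert_ids_to_tokens(eos_from_tokenizer))  # e.g. '</s>'
print(tokenizer.convert_ids_to_tokens(eos_from_config))     # should match
```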
Also, could you tell us the version of your tokenizer package? That could help with finding the problem.
No, thank *you*! It works great for me, and even Nvidia still cannot beat you: https://twitter.com/Jacoed/status/1770795726877990953.
```python
>>> tok.eos_token_id
50256
>>> llm.generation_config.eos_token_id
50295
>>> from transformers import ViltProcessor, AutoModel, AutoModelForCausalLM, AutoTokenizer
>>> import transformers
>>> transformers.__version__
'4.39.0'
```
The tokenizer comes from `transformers` for me. Does that answer your question, @Oyoy1235?
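In case it helps anyone hitting the same mismatch, the workaround I used (a sketch, not an official fix from the maintainers) is to align the generation config with the tokenizer before training or generation:

```python
# Force generation to stop on the tokenizer's EOS token. This assumes the
# tokenizer's EOS is the token the model is actually trained to emit.
model.generation_config.eos_token_id = tokenizer.eos_token_id
if tokenizer.pad_token_id is not None:
    model.generation_config.pad_token_id = tokenizer.pad_token_id
```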
Great. I think we have no other questions about eos_token_id.