The default eos_token_id is 2 but should be 11

#11
by mike-ravkine - opened

Hi Guys!

Fantastic model.

I have encountered an issue using both the 7B and 40B Falcon models with the recommended settings: they continue generating past <|endoftext|>.

The issue is that the LLaMA default eos_token_id=2 is specified here: https://huggingface.co/tiiuae/falcon-40b-instruct/blob/main/configuration_RW.py#L41

Looking at https://huggingface.co/tiiuae/falcon-40b-instruct/raw/main/tokenizer.json, this is not the LLaMA vocabulary: token 2 is >>INTRODUCTION<<, and I think the token we're looking for is 11, <|endoftext|>.

I am able to work around the generation problem by manually passing eos_token_id=11 on model invocation.
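In transformers, that workaround amounts to passing `eos_token_id=11` as a keyword argument to `model.generate(...)`. To illustrate why the wrong id causes runaway generation (this is a toy stdlib-only sketch of the stopping logic, not the actual transformers internals; all token ids except 11 are made up):

```python
def truncate_at_eos(token_ids, eos_token_id=11):
    """Cut a generated sequence at the first occurrence of the EOS token id."""
    if eos_token_id in token_ids:
        return token_ids[:token_ids.index(eos_token_id)]
    return token_ids

# 11 == <|endoftext|> in the Falcon tokenizer; 2 == >>INTRODUCTION<<
generated = [5, 7, 9, 11, 3, 4]

# With the wrong default (2), the EOS check never fires and generation runs on:
truncate_at_eos(generated, eos_token_id=2)   # -> [5, 7, 9, 11, 3, 4]

# With the correct id (11), the sequence stops at <|endoftext|>:
truncate_at_eos(generated, eos_token_id=11)  # -> [5, 7, 9]
```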

--Mike

Were you able to run the model on SageMaker?

Technology Innovation Institute org
edited Jun 1, 2023

Hey @mike-ravkine , glad you like the model

This is a bit surprising: while we should fix the default value in configuration_RW.py, config.json has been correct for some days now, so the config should be correct when the model is loaded.
See: https://huggingface.co/tiiuae/falcon-40b-instruct/commit/662a9a4ffd96f4f73dd18141b60962f94b743c56
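For anyone checking their local copy: after the fix, config.json should contain the correct id, along these lines (fragment only; other fields omitted, and the exact contents are in the linked commit):

```json
{
  "eos_token_id": 11
}
```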

Could it be an issue with using a cached model since before it was fixed?

Thanks for the response @FalconLLM , this makes sense. I am actually using a quantized version of the model (from https://huggingface.co/TheBloke/falcon-40b-instruct-GPTQ) and it was missing the fix to config.json from above. I have opened a PR against that repo!

mike-ravkine changed discussion status to closed
