Error when generating multiple outputs with Hugging Face generation

#16
by DongfuJiang - opened

I ran top-p sampling with this model, first on pure CPU, and got an `IndexError: index out of range in self` from the Llama `embed_tokens` layer. I tracked down the token that triggers the error and found that, if you use the Hugging Face `generate()` function, it automatically reads `pad_token_id` from `config.json`. In that file, `pad_token_id` is set to -1, which is an invalid index for the embedding.
I then checked `pad_token_id` on the tokenizer itself and found that it is actually 0, not -1. So I suspect the error is in the `config.json` file.
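For anyone who wants to reproduce the mismatch, here is a minimal sketch; the model id is a placeholder, not the actual repo path:

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "path/to/this-llama-model"  # placeholder, substitute the real repo id

config = AutoConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

print(config.pad_token_id)     # -1 in the shipped config.json: not a valid embedding index
print(tokenizer.pad_token_id)  # 0 according to the tokenizer
```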
Could the maintainers take a look at this file and fix it?
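Until the config is fixed, a workaround sketch that works on my side is to pass a valid `pad_token_id` to `generate()` explicitly, overriding the -1 from `config.json`. The model id is again a placeholder, and I am assuming the tokenizer's reported `pad_token_id` of 0 is the correct value:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/this-llama-model"  # placeholder, substitute the real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Batch of prompts of different lengths, so padding is actually exercised.
inputs = tokenizer(
    ["Once upon a time", "The quickest way to sort a list in Python is"],
    return_tensors="pt",
    padding=True,
)

outputs = model.generate(
    **inputs,
    do_sample=True,
    top_p=0.9,
    num_return_sequences=4,
    max_new_tokens=64,
    pad_token_id=tokenizer.pad_token_id,  # override the broken -1 from config.json
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

Setting `model.config.pad_token_id = tokenizer.pad_token_id` after loading should have the same effect, since `generate()` only falls back to the config value when no explicit argument is given.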

I also ran into this batch-generation problem and had no idea how to handle it. Your solution works for me, thanks a lot!

Thank you! This also fixes my bug with LLaMA.
