Should model(tokenizer(text)) work for bigcode/santacoder?

#13
by Dzmitry - opened
BigCode org

The bigcode/santacoder tokenizer produces a token_type_ids tensor. AFAIK the model was not trained to receive it as input, so model(tokenizer(text)["input_ids"]) behaves differently from model(tokenizer(text)). The former seems correct, whereas the latter seems at least risky.
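A minimal sketch of a middle ground between the two calls above: keep every tokenizer output except token_type_ids. It assumes the tokenizer output behaves like a mapping, as a `transformers` `BatchEncoding` does. The helper name `drop_token_type_ids` is hypothetical and not part of any library, and the example uses plain lists as a stand-in for real tensors.

```python
from collections.abc import Mapping


def drop_token_type_ids(encoding: Mapping) -> dict:
    """Return a plain dict copy of `encoding` without the `token_type_ids` entry."""
    return {key: value for key, value in encoding.items() if key != "token_type_ids"}


# Stand-in for a tokenizer output; real values would be tensors.
example = {
    "input_ids": [[1, 2, 3]],
    "token_type_ids": [[0, 0, 0]],
    "attention_mask": [[1, 1, 1]],
}
filtered = drop_token_type_ids(example)
print(sorted(filtered))  # ['attention_mask', 'input_ids']
```

With a real tokenizer and model this would presumably be called as `model(**drop_token_type_ids(tokenizer(text, return_tensors="pt")))`, which still passes attention_mask along with input_ids.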

BigCode org

Indeed, token_type_ids shouldn't be passed to the model. This PR prevents the tokenizer from returning it by default.
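For a tokenizer copy that predates the fix, a per-call alternative can be sketched as follows. This is not necessarily what the PR does. It assumes the standard `return_token_type_ids` argument of `transformers` tokenizers. The helper name `encode_for_model` is hypothetical, and the commented usage assumes the checkpoint loads with `trust_remote_code=True`.

```python
def encode_for_model(tokenizer, text: str):
    """Tokenize `text` while explicitly asking for no `token_type_ids`."""
    return tokenizer(text, return_tensors="pt", return_token_type_ids=False)


# Usage sketch (needs `transformers`, `torch`, and network access, so it is left commented out):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("bigcode/santacoder")
# model = AutoModelForCausalLM.from_pretrained("bigcode/santacoder", trust_remote_code=True)
# outputs = model(**encode_for_model(tokenizer, "def hello():"))
```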

cakiki changed discussion status to closed
