Prediction size vs tokenizer size

#3 opened by surya-narayanan

Hi, the model seems to output a tensor of size batch size × sequence length × 78672, but the tokenizer vocab size is 50265. Any idea why there's this discrepancy?
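For reference, a minimal repro sketch of what I'm doing (the checkpoint name here is my assumption, for illustration):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Checkpoint name is an assumption for illustration.
tokenizer = AutoTokenizer.from_pretrained("witiko/mathberta")
model = AutoModelForMaskedLM.from_pretrained("witiko/mathberta")

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

print(logits.shape)    # (batch size, sequence length, model output vocab size)
print(len(tokenizer))  # tokenizer vocabulary size, including added tokens
```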

Hi @surya-narayanan, MathBERTa's tokenizer was substantially extended to cover math vocabulary. At the time of training, this was not fully supported by transformers, so inconsistencies like this can still occasionally be found. FWIW, we later opened a related PR to the transformers library, which has since been merged.

Anyway, I've checked for you that the model's config matches len(tokenizer.vocab), so the current model and tokenizer should be good to use together as-is.
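If you want to verify this on your end, a quick check along these lines should work (again, the checkpoint name is assumed for illustration):

```python
from transformers import AutoConfig, AutoTokenizer

# Checkpoint name is an assumption for illustration.
config = AutoConfig.from_pretrained("witiko/mathberta")
tokenizer = AutoTokenizer.from_pretrained("witiko/mathberta")

# The model's output layer should be as wide as the tokenizer's vocabulary.
print(config.vocab_size)  # vocab size stored in the model config
print(len(tokenizer))     # tokens the tokenizer knows, including added ones
assert config.vocab_size == len(tokenizer)
```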

Hmm, I'm still facing an error. Should I reinstall transformers?
