Use V1 tokenizer instead
#10
opened by Rocketknight1
No description provided.
Rocketknight1 changed pull request title from "Upload tokenizer" to "Use V1 tokenizer instead"
There was an issue with the last PR - we used the V3 tokenizer, but this base model actually uses the V1 tokenizer. This should fix the issue!
@Rocketknight1 does it affect the vocab size? The model and tokenizer vocab sizes are not matching, so the model fails to load.
@lbathen can you give me some code to reproduce that issue? From here it looks like the tokenizer and the model both have a vocab size of 32000.
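For reference, a minimal sketch of how one might compare the two sizes. The repo id below is a placeholder for this model's actual repo, and `revision="refs/pr/10"` assumes you want the files from this PR rather than `main`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; substitute the actual model repo from this discussion.
repo_id = "your-org/your-model"
revision = "refs/pr/10"  # load the files proposed in this PR

tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)

# len(tokenizer) counts added tokens too; model.config.vocab_size is the
# size of the embedding/output layers. They should both report 32000 here.
print("tokenizer vocab size:", len(tokenizer))
print("model vocab size:", model.config.vocab_size)
assert len(tokenizer) == model.config.vocab_size
```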
@Rocketknight1 I confirmed that both show the same vocab of 32K now. I had pulled the wrong revision :)