Which tokenizer to use?

#1
by EZForever - opened

I've been trying to run this model with llama.cpp, but this repository is missing the tokenizer model. Since Mistral-7B-v0.2 seems to be gone, I tried the tokenizer models from Mistral-7B-v0.1 and Mistral-7B-Instruct-v0.2, but neither worked properly: the output was littered with missing tokens.

Is the tokenizer yet to be uploaded, or should I use another one? I'm a complete newbie to this, so please bear with me if it's a stupid question.

OpenBuddy org

Try downloading the tokenizer files from this model:
https://huggingface.co/OpenBuddy/openbuddy-mistral2-7b-v20.2-32k
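For anyone who wants to script this, here is a minimal sketch using only the Python standard library. It relies on the Hub's `resolve/<revision>/<filename>` download URLs. The file names in `TOKENIZER_FILES` and the `./model_dir` destination are assumptions, not something confirmed in this thread, so check the repository's file list and adjust them. The helper names `tokenizer_urls` and `download_tokenizer` are made up for illustration.

```python
import urllib.request
from pathlib import Path

REPO_ID = "OpenBuddy/openbuddy-mistral2-7b-v20.2-32k"

# Assumed file names: verify them against the repository's file list.
TOKENIZER_FILES = [
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
]


def tokenizer_urls(repo_id, filenames, revision="main"):
    """Build direct download URLs for files hosted in a Hub repository."""
    base = f"https://huggingface.co/{repo_id}/resolve/{revision}"
    return [f"{base}/{name}" for name in filenames]


def download_tokenizer(dest_dir, repo_id=REPO_ID, filenames=TOKENIZER_FILES):
    """Download each tokenizer file into dest_dir and return the saved paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name, url in zip(filenames, tokenizer_urls(repo_id, filenames)):
        target = dest / name
        # Raises urllib.error.HTTPError if the file does not exist in the repo.
        urllib.request.urlretrieve(url, str(target))
        saved.append(target)
    return saved


if __name__ == "__main__":
    # "./model_dir" is a placeholder for the folder holding the model weights.
    for path in download_tokenizer("./model_dir"):
        print("saved", path)
```

The idea is to put the downloaded files in the same folder as the model weights, so that whatever conversion step you run for llama.cpp can find them there.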

Thanks, can confirm the tokenizer from v20.2 works perfectly.

EZForever changed discussion status to closed
