Add tokenizer.model from the base model

#3

Add tokenizer.model from the base model (mistralai/Mistral-7B-v0.1)

MaralGPT org

Just a quick question (because we're probably going to merge this PR): does this tokenizer perform better than the current one?

This is the vanilla mistral-7b-v0.1 tokenizer.model file, which is missing from the current maral-7b repo, so the repo in its current state was not usable for me.
I was quantizing this model and hit an "spm tokenizer model not found" error in the process; adding this file solves the issue.
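
For anyone hitting the same error before this is merged, here is a minimal sketch of the workaround, assuming you have local copies of both the fine-tuned repo and the base model. The function name `ensure_tokenizer_model` and the directory paths are hypothetical, used only for illustration; the only thing taken from this PR is that the missing file is called tokenizer.model and comes from the base model.

```python
import shutil
from pathlib import Path


def ensure_tokenizer_model(model_dir, base_dir, filename="tokenizer.model"):
    """Copy `filename` from `base_dir` into `model_dir` if it is missing.

    Returns True if the file was copied, False if it was already present.
    Both directories are assumed to be local checkouts (hypothetical paths).
    """
    target = Path(model_dir) / filename
    if target.exists():
        # Nothing to do: the repo already ships the SentencePiece model.
        return False

    source = Path(base_dir) / filename
    if not source.exists():
        raise FileNotFoundError(f"{filename} not found in base model dir: {source}")

    # copy2 preserves file metadata along with the contents.
    shutil.copy2(source, target)
    return True
```

Calling something like `ensure_tokenizer_model("maral-7b", "Mistral-7B-v0.1")` before running the quantization tool should make the file available next to the other tokenizer files. This only works because the fine-tuned model reuses the base tokenizer unchanged; it would not be correct for a model with a modified vocabulary.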

Also, I plan to work on researching/training a better tokenizer for Persian text, and hopefully I can help with that in the near future.

MaralGPT org

Thanks for the explanation, merged.

Muhammadreza changed pull request status to merged
