Differences with mistral-7b-v0.2?

#7
by mallorbc - opened

From my understanding, the v0.2 model also had a 32k context window (without sliding window). Is the only difference here then the different tokenizer?

Yes, I'm also confused. Was this given other Unicode characters, such as Chinese, Japanese, Sanskrit, and Amharic (the non-standard ones)? Was it trained on multilingual data? What are the actual changes?

Mistral AI org

Hi there, the main changes are, as mentioned, the improved tokenizer and thus the larger vocabulary!

The link to the v0.2 model is dead in the README. There is no way to figure out what has changed since v0.1, and no documentation of basic model characteristics in the README. The paper is not cited either (probably the v0.1 paper: https://arxiv.org/pdf/2310.06825; is there one for v0.3?).

If I could issue a wish: please document your models better.

The real question is: can it be swapped? ...
But now I don't mind, as I think the embeddings are also tied to the tokenizer, so I got funny output when I did swap it! This is why I asked... as my model is the top Mistral 2 on the leaderboard for 7B.
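To illustrate why a naive tokenizer swap produces garbage, here is a minimal sketch (not the actual Mistral implementation; the vocabulary sizes and the toy 8-dimensional table are assumptions for illustration). A model's embedding matrix has one row per token id of the tokenizer it was trained with, so ids from a tokenizer with a larger vocabulary either fall off the end of the table or map old ids to embeddings trained for different tokens:

```python
import numpy as np

# Assumed vocabulary sizes for illustration: a smaller "old" vocabulary
# and a larger "new" one, as with an extended tokenizer.
old_vocab_size = 32000
new_vocab_size = 32768

# Toy embedding table trained against the old tokenizer:
# one row per old token id, 8 dimensions (hypothetical, for illustration).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(old_vocab_size, 8))

def embed(token_id: int, table: np.ndarray) -> np.ndarray:
    """Look up the embedding row for a token id."""
    if token_id >= table.shape[0]:
        raise IndexError(f"token id {token_id} has no embedding row")
    return table[token_id]

# Ids produced by the old tokenizer resolve fine...
vec = embed(old_vocab_size - 1, embeddings)

# ...but the new tokenizer can emit ids beyond the trained table,
# so a naive swap either crashes here or, where ids do overlap,
# silently feeds embeddings trained for different tokens.
try:
    embed(new_vocab_size - 1, embeddings)
except IndexError as err:
    print("swap fails:", err)
```

This is why swapping in a tokenizer with a different vocabulary also requires resizing (and usually retraining or fine-tuning) the embedding and output layers, not just replacing the tokenizer files.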
