Have these quants had their pre-tokenizer fixed?
#2 · opened by smcleod
Many Llama 3 quantizations were created with a missing pre-tokenizer type. Has this been fixed in these quants?
```
llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab: ************************************
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
llm_load_vocab: CONSIDER REGENERATING THE MODEL
llm_load_vocab: ************************************
```
They are based on this commit, which includes the BPE pre-tokenizer fixes:
https://github.com/ggerganov/llama.cpp/commit/ffe666572f98a686b17a2cd1dbf4c0a982e5ac0a
Is that the warning message you see when trying to load this one?
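If you want to double-check the metadata yourself, here is a rough sketch using the gguf Python package's GGUFReader. The field-access details and the example filename are assumptions on my part and may differ between gguf-py versions:

```python
# Rough sketch: check whether a GGUF quant has its pre-tokenizer type set.
# Assumes the gguf Python package (pip install gguf); field access details
# may differ between gguf-py versions. The filename is just an example.
from gguf import GGUFReader

reader = GGUFReader("Meta-Llama-3-8B-Instruct.Q4_K_M.gguf")  # example path

field = reader.fields.get("tokenizer.ggml.pre")
if field is None:
    # Without this key, llama.cpp falls back to the 'default' pre-tokenizer
    # and prints the "GENERATION QUALITY WILL BE DEGRADED!" warning above.
    print("missing pre-tokenizer type")
else:
    # For string fields, the value bytes sit in the part indexed by data[0]
    value = bytes(field.parts[field.data[0]]).decode("utf-8")
    print(f"pre-tokenizer type: {value}")  # Llama 3 quants should report 'llama-bpe'
```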
Ohhh gosh, sorry, I missed that! Please ignore 🤣
smcleod changed discussion status to closed
No worries :D