Llama3 tokenizer

#1
by 3thn - opened
Crusoe AI org
edited Apr 29

This was generated with the updated llama.cpp tokenizer conversion: https://github.com/ggerganov/llama.cpp/pull/6920

Please do test and let me know if you run into issues in this thread!

It seems to work for me, with the little testing I have done, setting the context length to 32768. But I had to alter the EOS token to 128009 using the gguf-set-metadata.py script.
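To illustrate why the EOS id in the GGUF metadata matters: the generation loop stops only when the sampled token equals the recorded EOS id. For Llama 3 instruct models, `<|eot_id|>` is 128009, so if the metadata still points at a different stop token, chat replies never hit the stop condition and the model rambles on. A toy sketch (the `generate`/`sample_next` names here are illustrative, not llama.cpp API):

```python
def generate(sample_next, eos_token_id, max_tokens=32):
    """Collect tokens until eos_token_id is sampled or the budget runs out."""
    out = []
    for _ in range(max_tokens):
        tok = sample_next()
        if tok == eos_token_id:
            break  # correct EOS id -> generation stops here
        out.append(tok)
    return out

# Toy "model" that emits a few tokens and then <|eot_id|> (128009).
stream = iter([1, 2, 3, 128009, 4, 5])
print(generate(lambda: next(stream), eos_token_id=128009))  # → [1, 2, 3]
```

The fix described above would look something like `python gguf-set-metadata.py model.gguf tokenizer.ggml.eos_token_id 128009` (metadata key name assumed from the GGUF conventions).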

Crusoe AI org

Thank you @SvenJoinH , that's helpful to log for any folks trying this out. I would've hoped that the hf->gguf script would copy over eos token when populating metadata but perhaps I missed something.

Sorry for being pedantic, but I'd like to be sure before downloading: you had fetched / pulled / built newest llama.cpp, and you used convert-hf-to-gguf.py (to f16 or f32?) before quantizing, is that right? Thanks.

Crusoe AI org

@Whatever76474758585 it's a fair question after all of the confusion over the last few days - but yes, this was built with a new version of llama.cpp after #6920 was merged, using convert-hf-to-gguf.py with f16 before quantizing.
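For anyone reproducing this, the pipeline described above would look roughly like the following (paths and quant type are placeholders, not taken from this thread):

```shell
# From a freshly pulled llama.cpp checkout that includes PR #6920:
# 1) convert the HF checkpoint to an f16 GGUF
python convert-hf-to-gguf.py ./Meta-Llama-3-8B-Instruct \
    --outtype f16 --outfile llama3-f16.gguf

# 2) quantize the f16 GGUF (quant type chosen as an example)
./quantize llama3-f16.gguf llama3-Q4_K_M.gguf Q4_K_M
```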

Unrelated, but could you please rename the model page to include -GGUF in it?
