Llama3 tokenizer
This was generated with the updated llama.cpp tokenizer conversion: https://github.com/ggerganov/llama.cpp/pull/6920
Please do test and let me know if you run into issues in this thread!
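If you want to sanity-check which tokenizer metadata actually ended up in the file, llama.cpp ships a small dump script under gguf-py/scripts. A minimal sketch (the script path and the model filename are placeholders for your own checkout/download):

```sh
# Dump only the key/value metadata (skip tensor info) so you can confirm
# the tokenizer fields written by the updated converter.
python llama.cpp/gguf-py/scripts/gguf-dump.py --no-tensors \
    Meta-Llama-3-8B-Instruct.Q4_K_M.gguf
# Look for entries like tokenizer.ggml.pre and tokenizer.ggml.eos_token_id.
```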
It seems to work for me, with the little testing I have done, setting the context length to 32768. But I had to alter the EOS token to 128009 using the gguf-set-metadata.py script.
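For anyone needing the same fix, this is roughly the invocation; a sketch assuming the standard GGUF key name tokenizer.ggml.eos_token_id and the script from llama.cpp's gguf-py/scripts (the model filename is a placeholder):

```sh
# Preview the change first without writing anything.
python llama.cpp/gguf-py/scripts/gguf-set-metadata.py --dry-run \
    Meta-Llama-3-8B-Instruct.Q4_K_M.gguf tokenizer.ggml.eos_token_id 128009
# Apply it for real (edits the file in place; the script asks for
# confirmation unless you pass --force).
python llama.cpp/gguf-py/scripts/gguf-set-metadata.py \
    Meta-Llama-3-8B-Instruct.Q4_K_M.gguf tokenizer.ggml.eos_token_id 128009
```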
Thank you @SvenJoinH, that's helpful to log for any folks trying this out. I would've hoped that the hf->gguf script would copy over the EOS token when populating metadata, but perhaps I missed something.
Sorry for being pedantic, but I'd like to be sure before downloading: you fetched/pulled/built the newest llama.cpp, and you used convert-hf-to-gguf.py (to f16 or f32?) before quantizing, is that right? Thanks.
@Whatever76474758585 it's a fair question after all the confusion of the last few days - but yes, this was built with a version of llama.cpp from after PR 6920 was merged, using convert-hf-to-gguf.py with f16.
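For reference, the rough pipeline; a sketch with placeholder paths, assuming a llama.cpp checkout that already includes PR 6920 and a built quantize binary (its exact path depends on how you built):

```sh
# 1. Convert the HF checkpoint to an f16 GGUF with the updated tokenizer handling.
python llama.cpp/convert-hf-to-gguf.py ./Meta-Llama-3-8B-Instruct \
    --outtype f16 --outfile Meta-Llama-3-8B-Instruct-f16.gguf

# 2. Quantize the f16 file (Q4_K_M is just an example target).
./llama.cpp/quantize Meta-Llama-3-8B-Instruct-f16.gguf \
    Meta-Llama-3-8B-Instruct.Q4_K_M.gguf Q4_K_M
```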
unrelated, but could you please rename the model page to have -GGUF in it?
@Bakanayatsu done