The tokenizer config does not match the version shared by the original author

#17
by GohioAC - opened

Original config: https://huggingface.co/liuhaotian/llava-v1.6-34b-tokenizer/blob/main/tokenizer_config.json
Config provided here: https://huggingface.co/llava-hf/llava-v1.6-34b-hf/blob/main/tokenizer_config.json

The two configs differ substantially. The most concerning difference is that the pad token is not the same.

Llava Hugging Face org

Hey! You should be looking at https://huggingface.co/liuhaotian/llava-v1.6-34b/blob/main/tokenizer_config.json for the original config, as that is the one loaded when generating with LLaVa.

In that case, there is only one difference between the two: the HF implementation has a special "<image>" token, used internally to inject image embeddings, which does not affect generation in any way.
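For anyone who wants to check this themselves, here is a minimal sketch of how the two tokenizer_config.json files could be compared key by key. The helper names (diff_configs, load_tokenizer_config, compare_repos) are made up for illustration, and the download step assumes the huggingface_hub package is installed and both repos are publicly readable. It only compares top-level keys, so nested entries show up as a single differing value.

```python
import json

MISSING = "<missing>"  # placeholder for a key that is absent from one config


def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for every top-level key that differs."""
    diff = {}
    for key in sorted(set(a) | set(b)):
        va = a.get(key, MISSING)
        vb = b.get(key, MISSING)
        if va != vb:
            diff[key] = (va, vb)
    return diff


def load_tokenizer_config(repo_id: str) -> dict:
    """Download and parse tokenizer_config.json from a Hub repo (needs network)."""
    # Imported lazily so diff_configs can be used without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=repo_id, filename="tokenizer_config.json")
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def compare_repos(repo_a: str, repo_b: str) -> None:
    """Print every top-level key whose value differs between the two repos."""
    diff = diff_configs(load_tokenizer_config(repo_a), load_tokenizer_config(repo_b))
    for key, (va, vb) in diff.items():
        print(f"{key}:\n  {repo_a}: {va!r}\n  {repo_b}: {vb!r}")
```

Calling compare_repos("liuhaotian/llava-v1.6-34b", "llava-hf/llava-v1.6-34b-hf") should list whatever differs between the two configs discussed above. Checking pad_token in that output is the quickest way to confirm whether it really changed.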
