Is it not possible to load and use a previous version of LLaVA with Transformers?

Opened by PerRing

I have trained my own LLaVA model with the previous LLaVA code (https://github.com/haotian-liu/LLaVA).
Is there a way to use a previous version of LLaVA with Transformers (using the code below)?

```python
from transformers import AutoProcessor, LlavaForConditionalGeneration

# PATH is the directory of the checkpoint
model = LlavaForConditionalGeneration.from_pretrained(PATH)
processor = AutoProcessor.from_pretrained(PATH)
```
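
For context, a minimal usage sketch once such a checkpoint loads; the image URL and the `USER: <image>\n... ASSISTANT:` prompt template are illustrative assumptions (the template used by the llava-hf checkpoints), not part of the original post:

```python
import requests
from PIL import Image

# Hypothetical example image; a custom model may expect a different prompt format.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(output[0], skip_special_tokens=True))
```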
Llava Hugging Face org

@PerRing let us know once you manage to convert the weights and push them to the Hub!

Llava Hugging Face org

You need to save the raw state dict of your previous model on the Hub in order for that script to work.

Llava Hugging Face org

For example, at https://huggingface.co/ybelkada/test-llava-13b/tree/main I stored the raw state dict of the previous LLaVA-13B model; if you do the same for your model, the conversion script should work smoothly.
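
For reference, saving and uploading the raw state dict could look something like the sketch below; the file name and repo id are placeholders, and `model` is assumed to be your original model loaded with the haotian-liu/LLaVA code:

```python
import torch
from huggingface_hub import HfApi

# `model` is assumed to be the original LLaVA model loaded with the
# haotian-liu/LLaVA code; the repo id below is a placeholder.
torch.save(model.state_dict(), "model_state_dict.bin")

api = HfApi()
api.create_repo("your-username/llava-raw-state-dict", exist_ok=True)
api.upload_file(
    path_or_fileobj="model_state_dict.bin",
    path_in_repo="model_state_dict.bin",
    repo_id="your-username/llava-raw-state-dict",
)
```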

I'm not sure, but I think line 58 of the conversion script should be changed from

```python
tokenizer.add_tokens(AddedToken("<image>", special=True, normalized=False), special=True)
```

to

```python
tokenizer.add_tokens(AddedToken("<image>", special=True, normalized=False))
```

because of:

```
TypeError: SpecialTokensMixin.add_tokens() got an unexpected keyword argument 'special'
```

+) I fixed line 58, but now an error happens at line 69:

```
ValueError: Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel: AutoModelForCausalLM.
```

++) My model is based on LLaMA-2 13B, but I expanded the vocab size (to 39479). Do you think this script will work on my model?

Llava Hugging Face org
  1. Okay, we need to use `special_tokens=True` (see the sketch after this list).
  2. You need to install transformers from source.
  3. The script should work; if not, we will make the vocab size be retrieved from the model, but I think it already is.
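
For anyone hitting the same TypeError, the corrected call would presumably look like the sketch below; this is an assumption based on the `add_tokens(new_tokens, special_tokens=...)` signature, not a confirmed patch to the script:

```python
# transformers needs to be installed from source for the Llava classes:
#   pip install git+https://github.com/huggingface/transformers.git
from transformers import AddedToken

# Pass the keyword `special_tokens=True` (not `special=True`) to add_tokens;
# `special=True` on the AddedToken itself marks the token as a special token.
tokenizer.add_tokens(
    AddedToken("<image>", special=True, normalized=False), special_tokens=True
)
```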

@ArthurZ @nielsr I got the same problem as @PerRing: the weights cannot be converted, as the script fails with those errors. My transformers lib is installed from source, and it seems that there is no LlavaModelForCausalLM registered at all. Is it possible to fix this? If it can be fixed and we know the performance is the same as the original LLaVA, users will have more incentive to use the HF version over the original one; otherwise, all previous improvements in LLaVA unfortunately cannot be used in HF. I see there is some discussion at https://github.com/huggingface/transformers/pull/27662#discussion_r1416880859; would it be possible to come up with a fix? Thanks!
