llava-surgery.py

#3
by jeiku - opened

When running the llama.cpp command to split off the projector file, I am met with this error:

[error screenshot]
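For reference, the command I am running is the surgery step from the llama.cpp LLaVA README, along these lines (the local model path is illustrative):

```sh
# Split the multimodal projector out of the checkpoint
# (per llama.cpp's examples/llava/README.md; model path is illustrative)
python ./examples/llava/llava-surgery.py -m ../llava-llama-3-8b
```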

I have also attempted to produce the GGUF mmproj file using your provided projector file, and I receive this error:

[error screenshot]
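For context, the mmproj step I am attempting is the image-encoder conversion from the same README, roughly (paths are illustrative):

```sh
# Convert the CLIP vision tower plus the extracted projector into an mmproj GGUF
# (per llama.cpp's examples/llava/README.md; paths are illustrative)
python ./examples/llava/convert-image-encoder-to-gguf.py \
    -m ../clip-vit-large-patch14-336 \
    --llava-projector ../llava-llama-3-8b/llava.projector \
    --output-dir ../llava-llama-3-8b
```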

Can you please advise me on the best method for producing a llama.cpp/koboldcpp-compatible mmproj file? Without this file, we are unable to provide our users with a coherent vision projector that works with all Llama 3 8B models.

For reference, we were able to produce the file with no issues using this model: https://huggingface.co/weizhiwang/LLaVA-Llama-3-8B. However, we are looking for the improved capabilities of your model.

I have attempted running llava-surgery-v2.py and come up against this error:

[error screenshot]
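The invocation follows the README, something like (model path is illustrative):

```sh
# LLaVA 1.6-style surgery; the -C flag also cleans the vision tower
# out of the checkpoint (per llama.cpp's examples/llava/README.md)
python ./examples/llava/llava-surgery-v2.py -C -m ../llava-llama-3-8b/
```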

I have checked your config and see this line:

[screenshot of the config line]

Am I to understand that this is not, in fact, a LLaVA model?

[screenshot]

This is a regular Llama 3 that has never seen multimodality?

[screenshot]

Yeah, this pretty much confirms it.

I have succeeded in converting the projector to GGUF by first converting the safetensors to a PyTorch .bin. Unfortunately, the mmproj output by this process crashed on load in both llama.cpp and koboldcpp. Please advise.
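For reference, the safetensors-to-bin step was just a straight state-dict re-save, something like this (file names are illustrative, and this assumes the projector weights live in a single safetensors shard):

```python
# Minimal sketch: re-save a safetensors checkpoint as a PyTorch .bin
# so the llama.cpp surgery scripts can read it (file names are illustrative)
import torch
from safetensors.torch import load_file

state_dict = load_file("model.safetensors")
torch.save(state_dict, "pytorch_model.bin")
```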

xtuner org
edited Apr 23

@jeiku Hi!
We have released weights in a LLaVA v1.5/v1.6-style architecture here. You can try this model with your workflow.

At the same time, our team is also working on the GGUF conversion, so we really look forward to your feedback!

Thanks again!

xtuner org

I think we should follow the LLaVA v1.6 conversion steps and use llava-surgery-v2.py, because this model uses LoRA fine-tuning on the CLIP-ViT.

https://github.com/ggerganov/llama.cpp/blob/master/examples/llava/README.md#llava-16-gguf-conversion
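Concretely, the remaining steps in that README after the surgery script look roughly like this (local paths are illustrative; the config URL is the one given in the README):

```sh
# Surgery leaves llava.projector and llava.clip next to the checkpoint;
# collect them into a vit/ directory with a matching ViT config
mkdir vit
cp ../llava-llama-3-8b/llava.clip vit/pytorch_model.bin
cp ../llava-llama-3-8b/llava.projector vit/
curl -s -q https://huggingface.co/cmp-nct/llava-1.6-gguf/raw/main/config_vit.json -o vit/config.json

# Build the mmproj GGUF from the extracted vision tower + projector
python ./examples/llava/convert-image-encoder-to-gguf.py -m vit \
    --llava-projector vit/llava.projector \
    --output-dir vit --clip-model-is-vision
```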


Following the 1.6 method, I am left with an mmproj file which shows this error after image upload:

[error screenshot]

Following the 1.5 method, I am left with an mmproj which works, but it has no stop token and generates 500 tokens of text every time.

Can you please upload a working mmproj if you have produced one?

xtuner org
edited Apr 24

@jeiku Thanks very much for your effort!

We are currently developing llava-phi-3-mini models. After the release of the new LLaVA models (expected in 1-2 days), we will proceed with the conversion to GGUF.
