convert llava-v1.5-7b to liuhaotian/llava-v1.5-7b-hf format

#26
by deleted - opened
deleted

Thank you for your outstanding work. I recently fine-tuned a LLaVA model starting from liuhaotian/llava-v1.5-7b, and I now want to serve it with vLLM to improve inference speed. vLLM expects checkpoints in the llava-v1.5-7b-hf format. If I load my fine-tuned llava-v1.5-7b model in vLLM directly, I get the error "Model architectures ['LlavaLlamaForCausalLM'] are not supported for now", so a conversion is required. How was the llava-v1.5-7b-hf checkpoint produced, and how can I convert my fine-tuned model to that format?
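The error quoted in the post is driven by the `architectures` field of the checkpoint's `config.json`. Below is a minimal sketch for checking which format a local checkpoint is in before handing it to an inference framework. It assumes the original-format checkpoint declares `LlavaLlamaForCausalLM` (the name in the error message) and that converted `-hf` checkpoints declare `LlavaForConditionalGeneration` (the Transformers class name). The helper name `checkpoint_format` is hypothetical.

```python
import json
from pathlib import Path

# The first name comes from the error message in the post; the second is
# assumed to be what converted "-hf" checkpoints declare in config.json.
ORIGINAL_ARCH = "LlavaLlamaForCausalLM"
HF_ARCH = "LlavaForConditionalGeneration"


def checkpoint_format(model_dir):
    """Classify a local checkpoint directory by its config.json architectures."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    architectures = config.get("architectures") or []
    if HF_ARCH in architectures:
        return "hf"
    if ORIGINAL_ARCH in architectures:
        return "original"
    return "unknown"
```

A result of `"original"` suggests the checkpoint still needs conversion; `"unknown"` means neither assumed name was found and the config should be inspected by hand.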

Llava Hugging Face org

Hi,

We recommend using the conversion script, found here: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llava/convert_llava_weights_to_hf.py.

However, I also recommend verifying the logits after conversion on the same inputs. I noticed that the original LLaVa model pads images, whereas the image processor in Transformers doesn't yet.
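The logit check suggested above can be sketched in a framework-agnostic way. This is a minimal example, not the maintainers' procedure: it assumes you have already run both the original and the converted model on identical inputs and turned each logits tensor into nested Python lists (for example with `tensor.tolist()`). The function names and the default tolerance of `1e-3` are assumptions chosen for illustration.

```python
def flatten(values):
    """Yield every number in an arbitrarily nested list or tuple as a float."""
    for item in values:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield float(item)


def max_abs_diff(logits_a, logits_b):
    """Return the largest element-wise absolute difference between two logit sets."""
    a = list(flatten(logits_a))
    b = list(flatten(logits_b))
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)} values")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def logits_match(logits_a, logits_b, atol=1e-3):
    """Check that two logit sets agree within an absolute tolerance (assumed value)."""
    return max_abs_diff(logits_a, logits_b) <= atol
```

If the check fails only on inputs that contain images, the padding difference mentioned above is a plausible first thing to look at.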

deleted

Thank you for your reply. I'll give it a try later. If successful, I'll update the instructions here.

Llava Hugging Face org

Hello, have you succeeded? If so, can you briefly tell me what to do? Thank you for your reply.

deleted

Yes, following the instructions at the link nielsr provided works. The steps outlined there are very detailed.

deleted

Btw, I just uploaded a fine-tuning notebook for LLaVa with Transformers here: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/LLaVa/Fine_tune_LLaVa_on_a_custom_dataset_(with_PyTorch_Lightning).ipynb

The link you provided works, thank you. Also, do you happen to know how LLaVA-NeXT is fine-tuned? The official repository (https://github.com/LLaVA-VL/LLaVA-NeXT/) does not provide specific fine-tuning code.

LLaVa-NeXT is very similar to LLaVa and can be fine-tuned with the same script after a few changes.

I adapted the provided notebook for LLaVa-NeXT: Colab Notebook

deleted

Great, thank you for your work. I am actually more interested in fine-tuning llava-next-video, though. Do you have any suggestions, or could you create a similar Jupyter notebook for it?

We haven't added LLaVa-NeXT-Video to transformers yet.

Among video LLMs, Video-LLaVa is available, and I am working on a fine-tuning script for it. I will let you know here when it's ready.

@Dengxiaoyu, I added a tutorial on fine-tuning Video-LLaVa in this Colab notebook.

deleted

Thank you for your enthusiastic help. If possible, I would also appreciate fine-tuning code for llava-next-video.

It has not been added to transformers yet. We plan to add Llava-Next-Video and create notebooks for it next month.

May I ask if there are any plans for transformers to support Llava-Next-Video?

According to our last conversation with the authors, they want to release a better version before it is added to transformers. You can track the issue here.
