How do you fine-tune LLaVA-NeXT?

#5
by Nishgop - opened

Is there a way to fine-tune LLaVA-NeXT?

Llava Hugging Face org

cc @lewtun: the TRL team is going to make it super easy to fine-tune models like these.

For now, I'll refer you to my demo notebook, which includes a bunch of utilities from the original LLaVa repository.

Thanks Niels, this is great!
I assume the same approach also works for LLaVA-NeXT. Is that correct?

Nishant

Llava Hugging Face org

Yes, it should, although Llava-NeXT is a bit more complex than Llava in terms of image preprocessing. A PR adding batched generation (which should also resolve the training issues) is here: https://github.com/huggingface/transformers/pull/29850.

For now, I'd recommend either Llava or Idefics2. Refer to my demo notebook: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/Idefics2/Fine_tune_Idefics2_for_JSON_extraction_use_cases_(PyTorch_Lightning).ipynb. I have tested this with both models.
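To illustrate the preprocessing point above: batched fine-tuning requires padding variable-length token sequences to a common length with a matching attention mask, and Llava-NeXT is harder to batch because its image features can also vary in size. A minimal, hypothetical sketch of that padding logic (in practice the model's processor or a data collator handles this; the function name and plain-list representation here are for illustration only):

```python
def pad_batch(sequences, pad_token_id, padding_side="right"):
    """Pad variable-length token-id lists to a common length.

    Returns (input_ids, attention_mask), where the mask is 1 for real
    tokens and 0 for padding. This mirrors what a processor or data
    collator does before a batched forward pass during fine-tuning.
    Left padding is typically used for batched generation instead.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        pad = [pad_token_id] * (max_len - len(seq))
        ones, zeros = [1] * len(seq), [0] * (max_len - len(seq))
        if padding_side == "right":
            input_ids.append(seq + pad)
            attention_mask.append(ones + zeros)
        else:
            input_ids.append(pad + seq)
            attention_mask.append(zeros + ones)
    return input_ids, attention_mask
```

The attention mask is what lets the model ignore the padding positions, which is why batched generation support (the PR linked above) and batched training tend to be solved together.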
