Fine Tuning LLAVA for classification

#24
by J812 - opened

Hello, Thank you for this great model :) I tried to use LLAVA as a classification model, where I needed to classify pictures in multiple categories, so far it seems that the model performed well for classification in couple of the categories but not in all of them.
I am wondering if it would be possible to finetune this model with my own data sets for better performance for my use case. If so could you please refer me to a documentation/piece of code on how I can to fine-tune this model ? Thank you :)

Llava Hugging Face org

Definitely! Although for image classification LLaVa might be a bit of an overkill. One could probably just leverage smaller image classifiers like ConvNeXT or SigLIP for this purpose.

See the demo notebooks: https://github.com/huggingface/notebooks/blob/main/examples/image_classification.ipynb.

If you want to fine-tune LLaVa, here's a demo script for that using the TRL library: https://github.com/huggingface/trl/blob/main/examples/scripts/vsft_llava.py

Sign up or log in to comment