Fine-Tuning with QLoRA: Overfitting

#26

by RicoRausch - opened 15 days ago

Discussion

RicoRausch

15 days ago

•

edited 14 days ago

Hello,
I am fine-tuning a model using QLoRA, but I'm running into overfitting. Increasing the dropout hasn't significantly improved performance on the test set. I would appreciate any advice on how to mitigate overfitting. The project is an OCR task aimed at extracting specific fields, and the model struggles in particular with extracting addresses. Thank you for your help.

VictorSanh

HuggingFaceM4 org 14 days ago

Hi @RicoRausch

That sounds like a general question (i.e. not specific to idefics2 itself) that would be more suitable for the discussion forum (https://discuss.huggingface.co/). I could find a few discussions on overfitting there.

Generally speaking, and without knowing too much about your problem, a few things you can explore: a larger weight decay, fine-tuning fewer parameters, and early stopping.
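Of these suggestions, early stopping is the easiest to illustrate without any particular training framework. The snippet below is only a minimal sketch in plain Python, not something taken from idefics2 or any library: `EarlyStopper`, `first_stop_step`, `patience`, and `min_delta` are hypothetical names chosen for illustration, and the loss values in the usage note are made up. The idea is to evaluate on a held-out set at regular intervals and stop once the validation loss has not improved for `patience` evaluations in a row, keeping the checkpoint from the best step.

```python
from dataclasses import dataclass


@dataclass
class EarlyStopper:
    """Hypothetical helper: signals when validation loss stops improving."""

    patience: int = 3               # evaluations to wait without improvement
    min_delta: float = 0.0          # smallest decrease that counts as improvement
    best_loss: float = float("inf")
    best_step: int = -1
    bad_evals: int = 0

    def update(self, step: int, val_loss: float) -> bool:
        """Record one evaluation; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # New best: remember it and reset the counter.
            self.best_loss = val_loss
            self.best_step = step
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


def first_stop_step(val_losses, patience=3, min_delta=0.0):
    """Replay a list of validation losses.

    Returns (stop_step, best_step); stop_step is None if the
    stopping criterion never triggers.
    """
    stopper = EarlyStopper(patience=patience, min_delta=min_delta)
    for step, loss in enumerate(val_losses):
        if stopper.update(step, loss):
            return step, stopper.best_step
    return None, stopper.best_step
```

In a real run, `update` would be called after each evaluation pass, and the checkpoint saved at `best_step` would be the one to keep. For example, replaying the made-up losses `[1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 1.1]` with `patience=3` stops at step 5 and points back to step 2 as the best checkpoint.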

VictorSanh changed discussion status to closed 14 days ago
