Fine-tuning options?

#10
by yukiarimo - opened

How to fine-tune those models on a custom dataset?

@yukiarimo We have instructions for "instruction tuning" and "parameter-efficient finetuning" here: https://github.com/apple/corenet/tree/main/projects/openelm

@mchorton The PEFT fine-tuning configs in the mentioned repo are heavily customized for the commonsense reasoning datasets. Are there instructions on fine-tuning the models on a different dataset?

This one covers instruct tuning: https://github.com/apple/corenet/blob/main/projects/openelm/README-instruct.md

How to fine-tune those models on a custom dataset?

Tried a full fine-tune with Hugging Face's SFTTrainer: took ~10 minutes for 3 epochs on a 4k-sample conversational dataset (OpenAssistant) on a 3090. The loss looks good, and the trained model behaves as expected in my quick vibe check.
code: https://github.com/geronimi73/3090_shorts/blob/main/nb_finetune-full_OpenELM-450M.ipynb
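For anyone who just wants the shape of that setup without opening the notebook, here is a minimal sketch of a full fine-tune with TRL's `SFTTrainer`. The chat-template tokens, dataset id, `messages` column layout, and hyperparameters are my assumptions for illustration, not taken from the notebook; OpenELM ships without its own tokenizer, so the Llama-2 tokenizer is reused as the model card suggests, and loading the model needs `trust_remote_code=True`.

```python
# Sketch of a full fine-tune of OpenELM-450M with TRL's SFTTrainer.
# Assumptions (not from the thread): the chat markers below, the dataset id,
# and a dataset where each row has a "messages" list of {"role", "content"} turns.

def format_conversation(messages):
    """Flatten a list of {"role", "content"} turns into one training string."""
    parts = [f"<|{turn['role']}|>\n{turn['content']}" for turn in messages]
    return "\n".join(parts) + "\n<|endoftext|>"

def run_training():
    """Full fine-tune sketch; call manually (downloads the model and dataset)."""
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import SFTConfig, SFTTrainer

    # OpenELM has no bundled tokenizer; the model card points at Llama-2's.
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = AutoModelForCausalLM.from_pretrained(
        "apple/OpenELM-450M", trust_remote_code=True
    )

    # Hypothetical dataset layout: one "messages" list per row.
    dataset = load_dataset("OpenAssistant/oasst1", split="train")
    dataset = dataset.map(
        lambda ex: {"text": format_conversation(ex["messages"])}
    )

    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset,
        args=SFTConfig(
            output_dir="openelm-450m-sft",
            num_train_epochs=3,          # matches the 3 epochs mentioned above
            dataset_text_field="text",
        ),
    )
    trainer.train()
```

The actual oasst1 dump stores messages as a tree rather than flat conversations, so you'd need to assemble threads into `messages` lists first; the formatting helper is the only part here that is dataset-independent.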

Thanks. That is what I need. Will try it later today.

Thanks g-Ronimo, how much did this cost? (roughly)

10 minutes on a 3090? nothing, it's my own GPU at home
