Finetuning

#1
by ewre324 - opened

Hi, I am new to the scene, hence a noob question: can you point me to some resources for finetuning this model?

Hi, @ewre324 !

I appreciate your interest in this model!

I think you'll get better results finetuning the base model, Felladrin/Minueza-32M-Base. In any case, I finetuned it using pure Python, following the instructions from Hugging Face's Supervised Fine-tuning Trainer.

Here's a quick example:

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load a ChatML-formatted dataset compatible with Minueza
dataset = load_dataset("Felladrin/ChatML-deita-10k-v0", split="train")

sft_config = SFTConfig(
    dataset_text_field="text",
    max_seq_length=2048,
    output_dir="/trained-model",
)

trainer = SFTTrainer(
    "Felladrin/Minueza-32M-Base",
    train_dataset=dataset,
    args=sft_config,  # pass the SFTConfig defined above
)

trainer.train()

I have uploaded several datasets in ChatML format, which are directly compatible with the Minueza model (i.e., the EOS token is <|im_end|>).
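For reference, a minimal sketch of what a ChatML-formatted training example looks like (the message content here is made up for illustration; the real datasets provide the formatted text in their "text" field):

```python
# Build a ChatML string from a list of chat messages.
messages = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 equals 4."},
]

# Each turn is wrapped in <|im_start|>{role} ... <|im_end|> markers;
# <|im_end|> doubles as the EOS token for Minueza.
text = "".join(
    f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
)
print(text)
# <|im_start|>user
# What is 2 + 2?<|im_end|>
# <|im_start|>assistant
# 2 + 2 equals 4.<|im_end|>
```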

But another easy way to fine-tune it is using the AutoTrain Advanced UI. Here's a screenshot:

[Screenshot: AutoTrain Advanced UI]

Thanks, @Felladrin, for guiding me to the correct resources.
Your script worked like magic. As per your suggestion, I used the base model. Thanks also for sharing the link to Hugging Face's Supervised Fine-tuning Trainer.
I am yet to test AutoTrain; I will surely update you.
Thanks for helping me out.

Just out of curiosity, what was your use case to design this small sized model?

You're welcome!

My use case was running a non-quantized model in iOS browsers via Transformers.js (ONNX) :) And it works pretty well!
You can find more details in this LinkedIn article.
But nowadays I'm using Wllama, which allows me to run much larger models on the same iPhone.
