Training details

#1
by anakin87 - opened

Hello and thanks for the good model!

If I understood correctly, after DPO on an English dataset, the model was further trained on Italian data.
Can you share more details about this step? I can't find the related script on GitHub...

SWAP Research Group@UNIBA org

Hi, you can find the DPO script here: https://github.com/marcopoli/LLaMAntino-3-ANITA/blob/main/model_adaptation/dpo_llama3.py
and the SFT script here: https://github.com/marcopoli/LLaMAntino-3-ANITA/blob/main/model_adaptation/finetune_llama3.py
Just change "model_name" and "dataset" accordingly. For the adaptation to the Italian language, just use the SFT script on a small portion of an Italian dataset (e.g., gsarti/clean_mc4_it), formatted as plain text without a chat template, i.e. <|begin_of_text|> {text} <|eot_id|><|end_of_text|>
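
For illustration, something like this (a rough sketch, not the exact training script; the "tiny" config and the streaming slice are just examples):

```python
# Rough sketch: format a small portion of gsarti/clean_mc4_it as plain text
# with the Llama-3 special tokens described above. Not the exact script.
from datasets import load_dataset

BOS = "<|begin_of_text|>"
EOS = "<|eot_id|><|end_of_text|>"

# Stream the dataset so only a small portion is downloaded;
# the "tiny" config is just an example size.
dataset = load_dataset("gsarti/clean_mc4_it", "tiny",
                       split="train", streaming=True)

def to_plain_text(example):
    # No chat template: plain BOS + raw text + EOS.
    return {"text": f"{BOS} {example['text']} {EOS}"}

dataset = dataset.map(to_plain_text)
# `dataset` can then be passed to the SFT script in place of
# the chat-formatted data.
```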

Thanks.
Very informative!

Hi @m-polignano-uniba ,

Was the fine-tuning on the Italian language performed with QLoRA/LoRA, or without?

SWAP Research Group@UNIBA org

Yes, we used QLoRA through Unsloth:

  • load_in_4bit=True, r=64, lora_alpha=16, target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"] (see the sketch below)
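
For reference, a minimal sketch of such a setup via Unsloth with the hyperparameters above (the model name, max_seq_length, and lora_dropout are assumptions, not values confirmed here):

```python
# Minimal sketch of a QLoRA setup via Unsloth using the hyperparameters above.
# The model name, max_seq_length, and lora_dropout are placeholders.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder base model
    max_seq_length=8192,   # assumption: pick what fits your GPU
    load_in_4bit=True,     # 4-bit quantized base weights (QLoRA)
)

model = FastLanguageModel.get_peft_model(
    model,
    r=64,                  # LoRA rank, as listed above
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0,        # assumption: not stated above
    bias="none",
    use_gradient_checkpointing=True,
)
```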

Can you share a rough idea of the peak GPU VRAM usage during the language adaptation phase?

In the paper, I read that you used an NVIDIA H100 64GB GPU, but further details would be much appreciated.
