Training details
Hello and thanks for the good model!
If I understood correctly, after DPO on an English dataset, the model was trained on Italian data.
Can you share more details about this step? I can't find the related script on GitHub.
Hi, you can find the DPO script here: https://github.com/marcopoli/LLaMAntino-3-ANITA/blob/main/model_adaptation/dpo_llama3.py
and the SFT script here: https://github.com/marcopoli/LLaMAntino-3-ANITA/blob/main/model_adaptation/finetune_llama3.py
Just change "model_name" and "dataset" accordingly. For the adaptation to the Italian language, just run the SFT script on a small portion of an Italian dataset (e.g., gsarti/clean_mc4_it), using plain text without a chat template, i.e. `<|begin_of_text|> {text} <|eot_id|><|end_of_text|>`
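For anyone following along, the plain-text formatting can be sketched as a tiny helper. The function name is hypothetical; the token layout is exactly the template quoted above (the precise spacing around the tokens is an assumption):

```python
def format_plain_text(text: str) -> str:
    # Wrap raw Italian text in the plain-text template (no chat template),
    # using the Llama-3 special tokens as quoted in the reply above.
    return f"<|begin_of_text|> {text} <|eot_id|><|end_of_text|>"

# Example: map a raw-text column into the field the SFT script trains on.
sample = format_plain_text("Ciao, mondo!")
```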
Thanks.
Very informative!
Hi @m-polignano-uniba ,
Is the fine-tuning on the Italian language performed with QLoRA/LoRA or without?
Yes, we used QLoRA through Unsloth:
- load_in_4bit = True, r = 64, lora_alpha = 16, target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
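For reference, that QLoRA setup maps onto Unsloth roughly as follows. This is a sketch, not the authors' exact script: the base model name, max_seq_length, lora_dropout, and bias values are assumptions; only load_in_4bit, r, lora_alpha, and target_modules come from the reply above.

```python
from unsloth import FastLanguageModel

# 4-bit quantized base model (QLoRA); model name and sequence length are assumptions.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed base model
    max_seq_length=2048,   # assumption
    load_in_4bit=True,     # from the reply above
)

# LoRA adapters on attention and MLP projections, as listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.0,  # assumption
    bias="none",       # assumption
)
```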
During the language adaptation phase, could you share a rough idea of the peak GPU VRAM usage?
In the paper, I read that you used an NVIDIA H100 64GB GPU, but further details would be much appreciated.