Model Card for diegobit/llama-3-8b-Instruct-bnb-4bit-ita-orpo

This is an ORPO fine-tune of Llama-3-8B for the Italian language, trained on the Italian UltraFeedback dataset: mii-community/ultrafeedback-preferences-translated-ita.
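
A minimal, hedged usage sketch with Hugging Face Transformers (the checkpoint is stored pre-quantized, so bitsandbytes must be installed; the prompt and generation settings below are illustrative, not taken from this card):

    # Illustrative inference sketch; generation settings are not part of this card.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "diegobit/llama-3-8b-Instruct-bnb-4bit-ita-orpo"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [{"role": "user", "content": "Spiegami in breve cos'è l'apprendimento per rinforzo."}]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))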

Model Details

Model Description

  • Developed by: Diego Giorgini
  • Funded by: AI Technologies SRL - www.aitechnologies.it
  • Language(s) (NLP): Italian
  • License: llama3
  • Finetuned from model: unsloth/llama-3-8b-Instruct-bnb-4bit

Training Details

Environment

unsloth: 2024.5
torch: 2.2

Training Data

mii-community/ultrafeedback-preferences-translated-ita is a selection of 55k rows from the UltraFeedback dataset, translated into Italian with argostranslate.
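
A minimal sketch of loading the dataset with the datasets library (the split name and column layout are assumptions; check the dataset card):

    from datasets import load_dataset

    # Load the Italian UltraFeedback preference data (split name "train" is assumed)
    dataset = load_dataset("mii-community/ultrafeedback-preferences-translated-ita", split="train")
    print(dataset)  # inspect the available columns (e.g. prompt / chosen / rejected pairs)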

Training Procedure

Preprocessing

  • No preprocessing was performed other than formatting with the Llama-3 chat_template from unsloth (a sketch of how this can be applied to the preference pairs follows below):

    from unsloth.chat_templates import get_chat_template

    tokenizer = get_chat_template(tokenizer, chat_template = "llama-3")
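
A hedged sketch of how the template might be applied to each preference pair before ORPO training; the column names ("prompt", "chosen", "rejected") and their string type are assumptions about the dataset layout, not confirmed by this card:

    # Illustrative formatting step: turn each preference pair into chat-formatted strings.
    def format_pair(row):
        prompt_msgs = [{"role": "user", "content": row["prompt"]}]
        return {
            "prompt": tokenizer.apply_chat_template(prompt_msgs, tokenize=False, add_generation_prompt=True),
            "chosen": row["chosen"] + tokenizer.eos_token,
            "rejected": row["rejected"] + tokenizer.eos_token,
        }

    dataset = dataset.map(format_pair)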

Training Hyperparameters

  • Training regime: 4-bit QLoRA (4-bit quantized base model with LoRA adapters); an end-to-end training sketch follows the parameter lists below.

  • Model loading parameters:

    max_seq_length = 8192
    dtype = None
    load_in_4bit = True

  • PEFT parameters:

    r = 64
    lora_alpha = 64
    lora_dropout = 0
    bias = "none"
    random_state = 3407
    use_rslora = False
    loftq_config = None

  • ORPOConfig parameters:

    max_length = 8192
    max_prompt_length = max_seq_length // 2
    max_completion_length = max_seq_length // 2
    warmup_ratio = 0.1
    weight_decay = 0.01
    per_device_train_batch_size = 1
    gradient_accumulation_steps = 16
    learning_rate = 8e-6
    beta = 0.1
    optim = "paged_adamw_8bit"
    lr_scheduler_type = "linear"
    num_train_epochs = 1
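
Putting the pieces together, a hedged end-to-end sketch of the training setup with unsloth and TRL's ORPOTrainer; the target_modules list, the split name, and the output_dir are assumptions (common unsloth defaults), not values stated in this card:

    from unsloth import FastLanguageModel
    from datasets import load_dataset
    from trl import ORPOConfig, ORPOTrainer

    max_seq_length = 8192

    # Load the 4-bit base model (values from the "Model loading parameters" list above)
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "unsloth/llama-3-8b-Instruct-bnb-4bit",
        max_seq_length = max_seq_length,
        dtype = None,
        load_in_4bit = True,
    )

    # Attach LoRA adapters (values from the "PEFT parameters" list above);
    # target_modules are the usual unsloth defaults and are an assumption here.
    model = FastLanguageModel.get_peft_model(
        model,
        r = 64,
        lora_alpha = 64,
        lora_dropout = 0,
        bias = "none",
        target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
        random_state = 3407,
        use_rslora = False,
        loftq_config = None,
    )

    # Dataset, to be chat-formatted as sketched in the Preprocessing section above
    dataset = load_dataset("mii-community/ultrafeedback-preferences-translated-ita", split = "train")

    # ORPO training configuration (values from the "ORPOConfig parameters" list above)
    orpo_config = ORPOConfig(
        max_length = 8192,
        max_prompt_length = max_seq_length // 2,
        max_completion_length = max_seq_length // 2,
        warmup_ratio = 0.1,
        weight_decay = 0.01,
        per_device_train_batch_size = 1,
        gradient_accumulation_steps = 16,
        learning_rate = 8e-6,
        beta = 0.1,
        optim = "paged_adamw_8bit",
        lr_scheduler_type = "linear",
        num_train_epochs = 1,
        output_dir = "outputs",  # assumption; not stated in the card
    )

    trainer = ORPOTrainer(
        model = model,
        args = orpo_config,
        train_dataset = dataset,
        tokenizer = tokenizer,
    )
    trainer.train()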

Speeds, Sizes, Times

Training took about 16 hours on a single A100-40GB (effective batch size 16: per_device_train_batch_size = 1 × gradient_accumulation_steps = 16).

Model Card Contact

diego.giorgini@icloud.com
