Edit model card

This model has been fine-tuned with the continuous pretraining mode of Unsloth on the gsarti/clean_mc4_it dataset (only 100k rows) to improve the Italian language. The second fine-tuning was performed on the instructed dataset FreedomIntelligence/alpaca-gpt4-italian.

Uploaded model

  • Developed by: e-palmisano
  • License: apache-2.0
  • Finetuned from model : unsloth/Qwen2-1.5B-Instruct-bnb-4bit

Evaluation

For a detailed comparison of model performance, check out the Leaderboard for Italian Language Models.

Here's a breakdown of the performance metrics:

Metric hellaswag_it acc_norm arc_it acc_norm m_mmlu_it 5-shot acc Average
Accuracy Normalized 48.05 32.68 46.89 42.57

This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
3,020
Safetensors
Model size
1.54B params
Tensor type
BF16
·
Inference API
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for e-palmisano/Qwen2-1.5B-ITA-Instruct

Quantizations
1 model

Datasets used to train e-palmisano/Qwen2-1.5B-ITA-Instruct

Collection including e-palmisano/Qwen2-1.5B-ITA-Instruct