datasets:
  - gsarti/clean_mc4_it
  - Chat-Error/wizard_alpaca_dolly_orca
  - mlabonne/orpo-dpo-mix-40k
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model_creator: Marco Polignano - SWAP Research Group
language:
  - en
  - it
metrics:
  - accuracy
pipeline_tag: text-generation
tags:
  - facebook
  - meta
  - pythorch
  - llama
  - llama-3
  - llamantino
library_name: transformers
license: llama3

"Built with Meta Llama 3".

LLaMAntino-3-ANITA-8B-Inst-DPO-ITA is a model of the LLaMAntino family of Large Language Models. It is an instruction-tuned version of Meta-Llama-3-8B-Instruct (itself a fine-tuned Llama 3 model). This version aims to be a multilingual model (EN 🇺🇸 + ITA 🇮🇹) that can be further fine-tuned on specific tasks in Italian.

The 🌟ANITA project🌟 *(Advanced Natural-based interaction for the ITAlian language)* aims to provide Italian NLP researchers with an improved model for Italian-language 🇮🇹 use cases.


Model Details

https://github.com/marcopoli/LLaMAntino-3-ANITA



Specifications

  • Model developers:
    Ph.D. Marco Polignano - University of Bari Aldo Moro, Italy
    SWAP Research Group
  • Variations: The released model was trained with supervised fine-tuning (SFT) using 4-bit QLoRA on instruction-based datasets. DPO over the mlabonne/orpo-dpo-mix-40k dataset was then applied to align the model with human preferences for helpfulness and safety.
  • Input: The model accepts text input only.
  • Language: Multilingual + Italian 🇮🇹
  • Output: The model generates text and code only.
  • Model Architecture: Llama 3 architecture.
  • Context length: 8K (8,192 tokens).
  • Library Used: LLaMA.cpp
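
The DPO step listed under Variations optimizes a pairwise preference objective. As a rough illustration only, the standard DPO loss for a single (chosen, rejected) pair can be sketched as below. This is not the authors' training code: the function name `dpo_loss` and the default `beta=0.1` are assumptions made for the example, and the inputs are assumed to be summed log-probabilities of each response under the trained policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair (illustrative sketch).

    beta is a hypothetical value; the model card does not state the one used.
    """
    # How much more (in log space) the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin between the chosen and rejected responses.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss equals log 2 when the policy and the reference agree, and it decreases as the policy shifts probability toward the chosen response relative to the reference.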

Prompt Template

<|start_header_id|>system<|end_header_id|>

{ SYS Prompt }<|eot_id|><|start_header_id|>user<|end_header_id|>

{ USER Prompt }<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{ ASSIST Prompt }<|eot_id|>
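
The template above can be assembled by hand with plain string formatting. The following is a minimal sketch that uses only the special tokens shown in the template; the helper names `format_turn` and `build_prompt` are hypothetical. Whether a begin-of-text token is prepended is left to the tokenizer, since the template above does not show one. In practice, the tokenizer's `apply_chat_template` method in Transformers is usually the safer way to produce this string.

```python
from typing import Optional

# Special tokens copied from the prompt template above.
START_HEADER = "<|start_header_id|>"
END_HEADER = "<|end_header_id|>"
EOT = "<|eot_id|>"


def format_turn(role: str, content: str) -> str:
    """Render one completed turn: role header, blank line, content, end-of-turn token."""
    return f"{START_HEADER}{role}{END_HEADER}\n\n{content}{EOT}"


def build_prompt(system_prompt: str,
                 user_prompt: str,
                 assistant_reply: Optional[str] = None) -> str:
    """Build a single-turn prompt following the template.

    With assistant_reply=None the prompt ends with an open assistant header,
    which is the form used to ask the model for a completion.
    """
    prompt = format_turn("system", system_prompt) + format_turn("user", user_prompt)
    if assistant_reply is None:
        # Leave the assistant turn open so the model writes the reply.
        return prompt + f"{START_HEADER}assistant{END_HEADER}\n\n"
    # Include the finished assistant turn, e.g. when formatting training data.
    return prompt + format_turn("assistant", assistant_reply)


if __name__ == "__main__":
    print(build_prompt("Sei un assistente utile.", "Chi è Carlo Magno?"))
```

Passing an `assistant_reply` yields the fully closed three-turn form shown in the template, which is the shape typically used for fine-tuning examples.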

ExLlamaV2

ExLlamaV2 is a great tool that makes it easy to quantize a model to the EXL2 format.

Citation instructions

@misc{polignano2024advanced,
      title={Advanced Natural-based interaction for the ITAlian language: LLaMAntino-3-ANITA}, 
      author={Marco Polignano and Pierpaolo Basile and Giovanni Semeraro},
      year={2024},
      eprint={2405.07101},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language}, 
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}