---
datasets:
- gsarti/clean_mc4_it
- Chat-Error/wizard_alpaca_dolly_orca
- mlabonne/orpo-dpo-mix-40k
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model_creator: Marco Polignano - SWAP Research Group
language:
- en
- it
metrics:
- accuracy
pipeline_tag: text-generation
tags:
- facebook
- meta
- pythorch
- llama
- llama-3
- llamantino
library_name: transformers
license: llama3
---

"Built with Meta Llama 3".

LLaMAntino-3-ANITA-8B-Inst-DPO-ITA is a model of the LLaMAntino - Large Language Models family. The model is an instruction-tuned version of Meta-Llama-3-8b-instruct (a fine-tuned LLaMA 3 model). This model version aims to be a Multilingual Model 🏁 (EN 🇺🇸 + ITA 🇮🇹) that can be further fine-tuned on specific tasks in Italian.

The 🌟**ANITA project**🌟 *(**A**dvanced **N**atural-based interaction for the **ITA**lian language)* aims to provide Italian NLP researchers with an improved model for Italian-language 🇮🇹 use cases.

## Model Details

[https://github.com/marcopoli/LLaMAntino-3-ANITA](https://github.com/marcopoli/LLaMAntino-3-ANITA)

- [**Full Model: LLaMAntino-3-ANITA-8B-Inst-DPO-ITA**](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA)
- ExLlamaV2 - **3.0bpw model**
- ExLlamaV2 - **4.0bpw model**
- ExLlamaV2 - **4.5bpw model**
- ExLlamaV2 - **measurement.json**
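To give a rough sense of what the bits-per-weight (bpw) figures above mean in practice, the sketch below estimates the size of the quantized weights from a parameter count and an average bpw. The parameter count of about 8.03 billion is an assumption based on the Llama 3 8B base model, and `estimate_weight_gib` is a hypothetical helper, not part of ExLlamaV2. Actual file sizes and VRAM use will differ, since the estimate ignores file metadata, any layers stored at a different precision, and the KV cache.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GiB for a given average bits per weight."""
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30                    # bytes -> GiB


# Assumption: roughly 8.03e9 parameters, as for the Llama 3 8B base model.
N_PARAMS = 8.03e9

if __name__ == "__main__":
    for bpw in (3.0, 4.0, 4.5):
        size = estimate_weight_gib(N_PARAMS, bpw)
        print(f"{bpw} bpw: about {size:.2f} GiB of weights")
```

A lower bpw variant trades some output quality for a smaller memory footprint, so the choice mostly depends on the available GPU memory.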

## Specifications

- **Model developers**: Ph.D. Marco Polignano - University of Bari Aldo Moro, Italy; SWAP Research Group
- **Variations**: The released model was trained with **supervised fine-tuning (SFT)** using 4-bit **QLoRA** on instruction-based datasets. A **DPO** step over the *mlabonne/orpo-dpo-mix-40k* dataset is used to align the model with human preferences for helpfulness and safety.
- **Input**: The model accepts text input only.
- **Language**: Multilingual 🏁 + Italian 🇮🇹
- **Output**: The model generates text and code only.
- **Model Architecture**: *Llama 3 architecture*.
- **Context length**: 8K (8192 tokens).
- **Library Used**: [LLaMA.cpp](https://github.com/ggerganov/llama.cpp)

### Prompt Template

```
<|start_header_id|>system<|end_header_id|>

{ SYS Prompt }<|eot_id|><|start_header_id|>user<|end_header_id|>

{ USER Prompt }<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{ ASSIST Prompt }<|eot_id|>
```
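As a minimal sketch of how the template above could be filled in by hand, the snippet below builds the prompt string with plain Python. The helper names `_turn` and `build_prompt` are hypothetical, and the blank line after each header is assumed from the standard Llama 3 chat format. The sketch adds only the tokens shown in the template; in practice a tokenizer's own chat template may be the safer route.

```python
from typing import Optional


def _turn(role: str, content: str) -> str:
    """Format one complete turn: role header, blank line, content, end-of-turn token."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"


def build_prompt(system: str, user: str, assistant: Optional[str] = None) -> str:
    """Assemble a prompt following the template (hypothetical helper).

    If `assistant` is None, the prompt ends with an open assistant header
    so that the model generates the reply itself.
    """
    prompt = _turn("system", system) + _turn("user", user)
    if assistant is None:
        return prompt + "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt + _turn("assistant", assistant)


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.", "Chi è Carlo Magno?"))
```

Leaving `assistant` unset is the usual choice for inference, while passing a reply produces a fully closed example of the kind used in fine-tuning data.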

## ExLlamaV2

[ExLlamaV2](https://github.com/turboderp/exllamav2) is a great tool that makes it easy to quantize a model in the **EXL2 format**.

## Citation instructions

```bibtex
@misc{polignano2024advanced,
      title={Advanced Natural-based interaction for the ITAlian language: LLaMAntino-3-ANITA},
      author={Marco Polignano and Pierpaolo Basile and Giovanni Semeraro},
      year={2024},
      eprint={2405.07101},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

```bibtex
@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language},
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

```bibtex
@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```