---
license: llama2
language:
- it
tags:
- text-generation-inference
---
<img src="https://i.ibb.co/6mHSRm3/llamantino53.jpg" alt="llamantino53" border="0" width="200px">

# Model Card for LLaMAntino-2-chat-13b-UltraChat-ITA
*Last Update: 08/01/2024*<br>
*Example of Use*: [Colab Notebook](https://colab.research.google.com/drive/1xUite70ANLQp8NwQE93jlI3epj_cpua7?usp=sharing)
<hr>

## Model description

<!-- Provide a quick summary of what the model is/does. -->

**LLaMAntino-2-chat-13b-UltraChat** is a *Large Language Model (LLM)*, an instruction-tuned version of **LLaMAntino-2-chat-13b** (an Italian-adapted **LLaMA 2 chat**). 
This model aims to provide Italian NLP researchers with an improved model for Italian dialogue use cases.

The model was trained with *QLoRA* on [UltraChat](https://github.com/thunlp/ultrachat) translated into Italian using [Argos Translate](https://pypi.org/project/argostranslate/1.4.0/). 
For more details on the training procedure, see the code we used at the following link:
- **Repository:** https://github.com/swapUniba/LLaMAntino

**NOTICE**: the code has not been released yet. We apologize for the delay; it will be available as soon as possible!

- **Developed by:** Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, Giovanni Semeraro
- **Funded by:** PNRR project FAIR - Future AI Research
- **Compute infrastructure:** [Leonardo](https://www.hpc.cineca.it/systems/hardware/leonardo/) supercomputer
- **Model type:** LLaMA-2-chat
- **Language(s) (NLP):** Italian
- **License:** Llama 2 Community License 
- **Finetuned from model:** [swap-uniba/LLaMAntino-2-chat-13b-hf-ITA](https://huggingface.co/swap-uniba/LLaMAntino-2-chat-13b-hf-ITA)

## Prompt Format

The following prompt format, based on the [LLaMA 2 prompt template](https://gpus.llm-utils.org/llama-2-prompt-template/) and adapted to Italian, was used:

```python
" [INST]<<SYS>>\n" \
"Sei un assistente disponibile, rispettoso e onesto. " \
"Rispondi sempre nel modo piu' utile possibile, pur essendo sicuro. " \
"Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. " \
"Assicurati che le tue risposte siano socialmente imparziali e positive. " \
"Se una domanda non ha senso o non e' coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. " \
"Se non conosci la risposta a una domanda, non condividere informazioni false.\n" \
"<</SYS>>\n\n" \
f"{user_msg_1}[/INST] {model_answer_1} </s> <s> [INST]{user_msg_2}[/INST] {model_answer_2} </s> ... <s> [INST]{user_msg_N}[/INST] {model_answer_N} </s>"
```

We recommend using the same prompt at inference time to obtain the best results!
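
For multi-turn conversations, the template above can be assembled programmatically. The following is a minimal sketch, not code from the LLaMAntino repository: `build_prompt` and its `history` argument are hypothetical names introduced here for illustration. It joins completed turns with ` </s> <s> [INST]` as in the template and leaves the final `[/INST]` open so the model can generate the next answer.

```python
# Italian system prompt, copied verbatim from the template above.
SYSTEM_PROMPT = (
    "Sei un assistente disponibile, rispettoso e onesto. "
    "Rispondi sempre nel modo piu' utile possibile, pur essendo sicuro. "
    "Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. "
    "Assicurati che le tue risposte siano socialmente imparziali e positive. "
    "Se una domanda non ha senso o non e' coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. "
    "Se non conosci la risposta a una domanda, non condividere informazioni false."
)


def build_prompt(history, user_msg, system_prompt=SYSTEM_PROMPT):
    """Build an inference prompt (hypothetical helper).

    history: list of (user_msg, model_answer) pairs for turns already completed.
    user_msg: the new user message the model should answer.
    """
    # The system block is opened once, inside the first [INST] tag.
    prompt = f" [INST]<<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    # Each completed turn is closed with </s>, and the next one reopened with <s> [INST].
    for past_user, past_answer in history:
        prompt += f"{past_user}[/INST] {past_answer} </s> <s> [INST]"
    # The last turn ends at [/INST]; the model's answer is generated from here.
    prompt += f"{user_msg}[/INST]"
    return prompt


if __name__ == "__main__":
    print(build_prompt([("Ciao!", "Ciao, come posso aiutarti?")], "Chi era Dante?"))
```

With an empty `history`, the function produces the single-turn layout: system block, user message, then `[/INST]`.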

## How to Get Started with the Model

Below you can find an example of model usage:

```python
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swap-uniba/LLaMAntino-2-chat-13b-hf-UltraChat-ITA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

user_msg = "Ciao! Come stai?"

prompt = " [INST]<<SYS>>\n" \
         "Sei un assistente disponibile, rispettoso e onesto. " \
         "Rispondi sempre nel modo piu' utile possibile, pur essendo sicuro. " \
         "Le risposte non devono includere contenuti dannosi, non etici, razzisti, sessisti, tossici, pericolosi o illegali. " \
         "Assicurati che le tue risposte siano socialmente imparziali e positive. " \
         "Se una domanda non ha senso o non e' coerente con i fatti, spiegane il motivo invece di rispondere in modo non corretto. " \
         "Se non conosci la risposta a una domanda, non condividere informazioni false.\n" \
         "<</SYS>>\n\n" \
         f"{user_msg}[/INST]"

pipe = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=False, # return only the newly generated text, without the prompt
    task='text-generation',
    max_new_tokens=512, # max number of tokens to generate in the output
    do_sample=True, # sampling must be enabled for temperature to take effect
    temperature=0.8  # temperature for more or less creative answers
)

# Method 1
sequences = pipe(prompt)
for seq in sequences:
    print(f"{seq['generated_text']}")

# Method 2
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
# max_new_tokens bounds only the generated part; max_length would also count the (long) prompt
outputs = model.generate(input_ids=input_ids, max_new_tokens=512)
print(tokenizer.batch_decode(outputs.detach().cpu().numpy()[:, input_ids.shape[1]:], skip_special_tokens=True)[0])
```

If you run into issues when loading the model, you can try loading it **quantized**:

```python
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
```

*Note*:
1) The model loading strategy above requires the [*bitsandbytes*](https://pypi.org/project/bitsandbytes/) and [*accelerate*](https://pypi.org/project/accelerate/) libraries.
2) By default, the tokenizer adds the '\<BOS\>' token at the beginning of the prompt. If it does not, add the *\<s\>* string as the starting token yourself.
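
One way to check the second point is to inspect the first token id after tokenization. The sketch below works on a plain list of ids so that it runs without downloading the model; `ensure_bos` is a hypothetical helper and the ids in the example are made up. In real use you would pass the ids produced by the tokenizer together with `tokenizer.bos_token_id`.

```python
def ensure_bos(token_ids, bos_token_id):
    """Return a copy of token_ids that starts with bos_token_id (hypothetical helper)."""
    # If the sequence already starts with BOS, leave it unchanged.
    if token_ids and token_ids[0] == bos_token_id:
        return list(token_ids)
    # Otherwise prepend the BOS id.
    return [bos_token_id] + list(token_ids)


if __name__ == "__main__":
    # Made-up ids for illustration only; 1 stands in for the BOS id here.
    print(ensure_bos([518, 25580], bos_token_id=1))
    print(ensure_bos([1, 518, 25580], bos_token_id=1))
```

Checking ids rather than the raw string avoids adding a second BOS token when the tokenizer has already inserted one.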
   
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

*Coming soon*!

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

If you use this model in your research, please cite the following:

```bibtex
@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language}, 
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

*Notice:* Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved. [*License*](https://ai.meta.com/llama/license/)