---
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
tags:
- trl
- sft
- generated_from_trainer
datasets:
- generator
model-index:
- name: DanteLLM_instruct_7b-v0.2-boosted
  results: []
library_name: peft
---
## DanteLLM

DanteLLM is a large language model developed at the Sapienza lab.
In October 2023 we submitted a paper titled *DanteLLM: Let's Push Italian LLM Research Forward!*
🤗 🇮🇹

The paper was accepted with review scores of 5, 4, and 4 out of 5.

## How to run the model

```python
from peft import PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "rstless-research/DanteLLM-7B-Instruct-Italian-v0.1"

# Read the adapter config to find the base model, whose tokenizer is used here.
config = PeftConfig.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load in 8-bit (requires bitsandbytes). device_map="auto" already places the
# weights, so the quantized model must not be moved again with model.to(...).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
model.eval()

messages = [
    {"role": "user", "content": "Ciao chi sei?"},
    {"role": "assistant", "content": "Ciao, sono DanteLLM, un large language model. Come posso aiutarti?"},
    {"role": "user", "content": "Quanto dista la Terra dalla Luna?"}
]

# Build the prompt with the chat template and move it to the model's device.
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(model.device)

generated_ids = model.generate(input_ids=model_inputs, max_new_tokens=300, do_sample=True, temperature=0.3)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
# La Terra si trova a 384,400 chilometri (238,855 miglia) dalla Luna. La distanza varia leggermente a causa della sua orbita ellittica.
```
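
Because `generate` returns the prompt tokens followed by the newly generated ones, `decoded[0]` above contains the whole conversation rather than only the answer. A minimal sketch of keeping just the new part is shown below, using plain Python lists as a stand-in for the tensors. The helper name `new_tokens_only` and the toy token ids are hypothetical and not part of the original snippet.

```python
def new_tokens_only(generated_ids, prompt_length):
    """Return each generated sequence without its leading prompt tokens."""
    return [sequence[prompt_length:] for sequence in generated_ids]


# Toy stand-in for model output: 3 prompt tokens followed by 2 new tokens.
# The ids are made up for illustration only.
prompt_ids = [1, 733, 16289]
generated_ids = [prompt_ids + [5949, 2]]

answer_ids = new_tokens_only(generated_ids, len(prompt_ids))
print(answer_ids)  # [[5949, 2]]
```

With the objects from the snippet above, this would presumably be called as `new_tokens_only(generated_ids, model_inputs.shape[1])`, and the result passed to `tokenizer.batch_decode(..., skip_special_tokens=True)` to print only the model's reply.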

## Authors

- Andrea Bacciu* (work done prior to joining Amazon)
- Cesare Campagnano*
- Giovanni Trappolini
- Prof. Fabrizio Silvestri

\* Equal contribution