---
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
tags:
- trl
- sft
- generated_from_trainer
datasets:
- generator
model-index:
- name: DanteLLM_instruct_7b-v0.2-boosted
results: []
library_name: peft
---
## DanteLLM
DanteLLM is a Large Language Model developed at Sapienza University of Rome.
In October 2023 we submitted a paper titled *DanteLLM: Let's Push Italian LLM Research Forward!* 🤌 🇮🇹, which was accepted with review scores of 5, 4, and 4 out of 5.
## How to run the model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "rstless-research/DanteLLM-7B-Instruct-Italian-v0.1"

# device_map="auto" already places the 8-bit model on the available GPU(s),
# so no additional model.to(...) call is needed afterwards.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model.eval()

messages = [
    {"role": "user", "content": "Ciao chi sei?"},  # "Hi, who are you?"
    {"role": "assistant", "content": "Ciao, sono DanteLLM, un large language model. Come posso aiutarti?"},  # "Hi, I'm DanteLLM, a large language model. How can I help you?"
    {"role": "user", "content": "Quanto dista la Terra dalla Luna?"},  # "How far is the Moon from the Earth?"
]

# apply_chat_template tokenizes the conversation into a tensor of input ids
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

generated_ids = model.generate(input_ids=model_inputs, max_new_tokens=300, do_sample=True, temperature=0.3)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
# La Terra si trova a 384,400 chilometri (238,855 miglia) dalla Luna. La distanza varia leggermente a causa della sua orbita ellittica.
# ("The Earth is 384,400 kilometres (238,855 miles) from the Moon. The distance varies slightly because of its elliptical orbit.")
```
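Note that passing `load_in_8bit=True` directly to `from_pretrained` is deprecated in recent `transformers` releases in favor of an explicit `BitsAndBytesConfig`. Below is a minimal, equivalent loading sketch, assuming `bitsandbytes` is installed and a CUDA GPU is available:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "rstless-research/DanteLLM-7B-Instruct-Italian-v0.1"

# Explicit 8-bit quantization config, equivalent to load_in_8bit=True
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```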
## Authors
- Andrea Bacciu* (work done prior to joining Amazon)
- Cesare Campagnano*
- Giovanni Trappolini
- Prof. Fabrizio Silvestri
\* Equal contribution