---
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
tags:
- trl
- sft
- generated_from_trainer
datasets:
- generator
model-index:
- name: DanteLLM_instruct_7b-v0.2-boosted
  results: []
library_name: peft
---

## DanteLLM

DanteLLM is a Large Language Model developed at Sapienza University of Rome. In October 2023 we submitted the paper *DanteLLM: Let's Push Italian LLM Research Forward!* 🤌 🇮🇹, which was accepted with review scores of 5, 4, and 4 out of 5.

## How to run the model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "rstless-research/DanteLLM-7B-Instruct-Italian-v0.1"

# Load the model in 8-bit; device_map="auto" places it on the available GPU(s)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model.eval()

messages = [
    {"role": "user", "content": "Ciao chi sei?"},  # "Hi, who are you?"
    {"role": "assistant", "content": "Ciao, sono DanteLLM, un large language model. Come posso aiutarti?"},  # "Hi, I'm DanteLLM, a large language model. How can I help you?"
    {"role": "user", "content": "Quanto dista la Terra dalla Luna?"}  # "How far is the Earth from the Moon?"
]

# Build the prompt with the model's chat template and move it to the model's device
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

generated_ids = model.generate(input_ids=model_inputs, max_new_tokens=300, do_sample=True, temperature=0.3)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
# La Terra si trova a 384,400 chilometri (238,855 miglia) dalla Luna.
# La distanza varia leggermente a causa della sua orbita ellittica.
# ("The Earth is 384,400 kilometres (238,855 miles) from the Moon.
#   The distance varies slightly because of its elliptical orbit.")
```

If you prefer to load DanteLLM as a PEFT adapter on top of the base model, see the sketch at the end of this card.

## Authors

- Andrea Bacciu* (work done prior to joining Amazon)
- Cesare Campagnano*
- Giovanni Trappolini
- Prof. Fabrizio Silvestri

\* Equal contribution
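
## Loading as a PEFT adapter

The card's metadata lists `library_name: peft` with `base_model: mistralai/Mistral-7B-Instruct-v0.2`, so the repository may ship DanteLLM as a PEFT (LoRA) adapter rather than as merged weights. Below is a minimal sketch, assuming the repo contains an adapter checkpoint, of attaching it to the base model with the standard `peft` API. The base and adapter IDs are taken from this card; adjust `adapter_id` if this repo's actual ID differs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"  # base model from the card metadata
adapter_id = "rstless-research/DanteLLM-7B-Instruct-Italian-v0.1"  # assumed adapter repo

# Load the frozen base model, then attach the adapter weights on top of it
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model.eval()

# Optionally fold the adapter into the base weights for faster inference
# (requires a non-quantized base model):
# model = model.merge_and_unload()
```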