swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA

m-polignano-uniba committed on May 10, 2024

Commit 6493b34 · verified · 1 Parent(s): 17d173a

Update README.md

Files changed (1)

README.md +2 -48

@@ -43,7 +43,7 @@ wants to provide Italian NLP researchers with an improved model the for Italian
 | Model | HF    | EXL2  | GGUF  | AWQ  |
 |-------|-------|-------|-------|-------|
-| m-polignano-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA | [Link](https://huggingface.co/m-polignano-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA) | [Link](#) | [Link](#) | [Link](#) |
+| m-polignano-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA | [Link](https://huggingface.co/m-polignano-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA) | [Soon](#) | [Soon](#) | [Soon](#) |
 <hr>
@@ -199,52 +199,6 @@ For direct use with `transformers`, you can easily get started with the followin
   ```
-### Unsloth
-For direct use with `unsloth`, you can easily get started with the following steps.
-- Firstly, you need to install unsloth via the command below with `pip`.
-  ```bash
-  pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
-  pip install --no-deps xformers trl peft accelerate bitsandbytes
-  ```
-- Initialize and optimize the model before use.
-  ```python
-  from unsloth import FastLanguageModel
-  import torch
-  base_model = "m-polignano-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA"
-  model, tokenizer = FastLanguageModel.from_pretrained(
-      model_name = base_model,
-      max_seq_length = 8192,
-      dtype = None,
-      load_in_4bit = True, # Change to `False` if you don't want to use 4bit quantization.
-  )
-  FastLanguageModel.for_inference(model)
-  ```
-- Right now, you can start using the model directly.
-  ```python
-  sys = "Sei un an assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA " \
-      "(Advanced Natural-based interaction for the ITAlian language)." \
-      " Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."
-  messages = [
-      {"role": "system", "content": sys},
-      {"role": "user", "content": "Chi è Carlo Magno?"}
-  ]
-  prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-  inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
-  for k,v in inputs.items():
-      inputs[k] = v.cuda()
-  outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
-  results = tokenizer.batch_decode(outputs)[0]
-  print(results)
-  ```
 <hr>
 ## Evaluation
@@ -264,7 +218,7 @@ Evaluated with lm-evaluation-benchmark-harness for the [**Open Italian LLMs Lead
 | Hellaswag_IT    | 0.7093 |
 | MMLU_IT          | 0.5672 |
+<hr>
 ## Unsloth