---
language:
  - it
license: apache-2.0
tags:
  - text-generation-inference
  - text generation
---

Mistral-7B-v0.1 for Italian Language Text Generation

Overview

Mistral-7B-v0.1 is a state-of-the-art large language model (LLM) pre-trained for text generation. Despite having only 7 billion parameters, it outperforms larger models such as Llama 2 13B on many standard benchmarks.

Model Architecture

Mistral-7B-v0.1 is a transformer-based model that handles a variety of tasks, including translation, summarization, and text completion. This release, Mistral-Ita-7b, adapts it specifically to the Italian language, and it can be further fine-tuned for downstream tasks.

Quantized Version

DeepMount00/Mistral-Ita-7b-GGUF
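For CPU or low-VRAM inference, the GGUF build can be loaded with llama-cpp-python. A minimal sketch, assuming a Q4_K_M quantization is published in that repository (check its file list for the actual filenames):

from llama_cpp import Llama

# NOTE: the quantization filename pattern is an assumption; verify it against
# the files actually published in DeepMount00/Mistral-Ita-7b-GGUF.
llm = Llama.from_pretrained(
    repo_id="DeepMount00/Mistral-Ita-7b-GGUF",
    filename="*Q4_K_M.gguf",
)

# llama.cpp inserts the BOS token itself, so no manual "<s>" prefix is needed.
# "Chi ha scritto la Divina Commedia?" = "Who wrote the Divine Comedy?"
out = llm("[INST] Chi ha scritto la Divina Commedia? [/INST]", max_tokens=128)
print(out["choices"][0]["text"])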

Unique Features for Italian

  • Tailored Vocabulary: The model's vocabulary is fine-tuned to encompass the nuances and diversity of the Italian language.
  • Enhanced Understanding: Mistral-7B is specifically trained to grasp and generate Italian text, ensuring high linguistic and contextual accuracy.

Capabilities

  • Vocabulary Size: 32,000 tokens, allowing for a broad range of inputs and outputs.
  • Hidden Size: 4,096 dimensions, providing rich internal representations.
  • Intermediate Size: 14,336 dimensions, which contributes to the model's ability to process and generate complex sentences.
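These values correspond to fields in the model's configuration and can be checked directly without downloading the weights:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("DeepMount00/Mistral-Ita-7b")
print(config.vocab_size)         # 32000
print(config.hidden_size)        # 4096
print(config.intermediate_size)  # 14336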

How to Use

To use Mistral-Ita-7b for Italian text generation:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name = "DeepMount00/Mistral-Ita-7b"

# The checkpoint is a MistralForCausalLM model; AutoModelForCausalLM resolves
# to the correct class automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

def stream(user_prompt):
    system_prompt = ""
    B_INST, E_INST = "<s> [INST]", "[/INST]"
    prompt = f"{system_prompt}{B_INST}{user_prompt.strip()}\n{E_INST}"
    # model.device follows whatever placement device_map="auto" selected
    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    # Stream tokens to stdout as they are generated, hiding the prompt itself
    streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    # Greedy decoding (do_sample=False) replaces the original near-zero temperature
    _ = model.generate(**inputs, streamer=streamer, max_new_tokens=300,
                       do_sample=False, repetition_penalty=1.2, eos_token_id=2)

# Question: "Write a Python function that multiplies every value in the list by 2:"
domanda = """Scrivi una funzione python che moltiplica per 2 tutti i valori della lista:"""
contesto = """
[-5, 10, 15, 20, 25, 30, 35]
"""

prompt = domanda + "\n" + contesto

stream(prompt)
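The same prompt format also works without streaming. A minimal sketch that reuses the model and tokenizer loaded above (the example question "Qual è la capitale d'Italia?" means "What is the capital of Italy?"):

prompt = "<s> [INST]Qual è la capitale d'Italia?\n[/INST]"
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=100, do_sample=False,
                            repetition_penalty=1.2, eos_token_id=2)
# Decode only the newly generated tokens, skipping the echoed prompt
answer = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:],
                          skip_special_tokens=True)
print(answer)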

Developer

Michele Montebovi