Model Card for Minerva-3B-Instruct-v1.0

Minerva-3B-Instruct-v1.0 is an instruction-tuned version of the Minerva-3B-base-v1.0 model, specifically fine-tuned for understanding and following instructions in Italian.

Model Details

Model Description

Evaluation

For a detailed comparison of model performance, check out the Leaderboard for Italian Language Models.

Here's a breakdown of the performance metrics:

| Model | hellaswag_it (acc_norm) | arc_it (acc_norm) | m_mmlu_it (5-shot acc) | Average |
|---|---|---|---|---|
| Minerva-3B-Instruct-v1.0 | 0.5197 | 0.3157 | 0.2631 | 0.366 |
| Minerva-3B-base-v1.0 | 0.5187 | 0.3045 | 0.2612 | 0.361 |
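
The Average column appears to be the unweighted mean of the three benchmark scores. The card does not state this, so treat it as an assumption; the short check below recomputes it from the reported values and agrees to three decimals.

```python
# Scores copied from the table above, in the order:
# hellaswag_it (acc_norm), arc_it (acc_norm), m_mmlu_it (5-shot acc).
scores = {
    "Minerva-3B-Instruct-v1.0": [0.5197, 0.3157, 0.2631],
    "Minerva-3B-base-v1.0": [0.5187, 0.3045, 0.2612],
}


def average(values):
    """Unweighted arithmetic mean (assumed to be how Average is computed)."""
    return sum(values) / len(values)


averages = {name: average(vals) for name, vals in scores.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.3f}")
```

On these numbers the instruction-tuned model is ahead of the base model by about 0.005 on average, with most of the difference coming from arc_it.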

Sample Code

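  from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
  import torch
  # Fix the seed so runs are reproducible.
  torch.random.manual_seed(0)
  # Alpaca-style prompt in Italian. English gloss: "Below is an instruction that describes a task,
  # paired with an input that provides further information. Write a response that appropriately
  # completes the request." The instruction asks for a romantic evening activity; the input is empty.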
  prompt = """Di seguito è riportata un'istruzione che descrive un'attività, abbinata ad un input che fornisce
  ulteriore informazione. Scrivi una risposta che soddisfi adeguatamente la richiesta.
  
  ### Istruzione:
  Suggerisci un'attività serale romantica

  ### Input:
  
  
  ### Risposta:"""
  
  model_id = "FairMind/Minerva-3B-Instruct-v1.0"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id, 
      device_map="cuda", 
      torch_dtype="auto", 
      trust_remote_code=True, 
  )
  
  # Greedy decoding: with do_sample=False, sampling parameters such as temperature are not used.
  generation_args = {
      "max_new_tokens": 500,
      "return_full_text": False,
      "do_sample": False,
  }
  
  pipe = pipeline(
      "text-generation",
      model=model,
      tokenizer=tokenizer,
  )
  
  output = pipe(prompt, **generation_args)
  print(output[0]['generated_text'])
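
The prompt in the sample follows an instruction / input / response layout. To reuse it with other instructions, a small helper along these lines may be convenient. `build_prompt` is a hypothetical name, not part of the model's or the library's API, and the exact whitespace between sections is an assumption based on the sample prompt rather than a documented template.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the Italian instruction template shown in the sample code.

    Hypothetical helper: section spacing is assumed from the sample prompt.
    """
    header = (
        "Di seguito è riportata un'istruzione che descrive un'attività, "
        "abbinata ad un input che fornisce ulteriore informazione. "
        "Scrivi una risposta che soddisfi adeguatamente la richiesta."
    )
    return (
        f"{header}\n\n"
        f"### Istruzione:\n{instruction.strip()}\n\n"
        f"### Input:\n{input_text.strip()}\n\n"
        "### Risposta:"
    )


# Same instruction as in the sample, with an empty input section.
prompt = build_prompt("Suggerisci un'attività serale romantica")
print(prompt)
```

The resulting string can be passed to the text-generation pipeline in place of the hard-coded prompt.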

Model size: 2.89B params
Tensor type: FP16 (Safetensors)