This is a pretrained model that should be fine-tuned before it is used for downstream tasks. You agree not to use the model to conduct experiments that cause harm to human subjects, or to perform any medical-related task.

Igea-1B-v0.0.1 ⚕️🩺

Igea is a biomedical Small Language Model (SLM) for Italian, continually pretrained from Minerva on PubMed abstracts translated into Italian with neural machine translation (NMT).

🔓: Access to the model is only granted after explicitly acknowledging that you have read the 'Bias, Risks, and Limitations' section of this model card.

This is ongoing research. Do not use it for any medical-related tasks.

Preprint: Igea: a Decoder-Only Language Model for Biomedical Text Generation in Italian.

How to use Igea with Hugging Face transformers

import transformers
import torch

model_id = "bmi-labmedinfo/Igea-1B-v0.1"

# Initialize the pipeline.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Input text for the model.
input_text = "Il fegato è "

# Compute the outputs.
output = pipeline(
    input_text,
    max_new_tokens=128,
)
print(output)

# Output:
# [{'generated_text': "Il fegato è una ghiandola fondamentale per il metabolismo umano, la più [...]"}]

🚨⚠️🚨 Bias, Risks, and Limitations 🚨⚠️🚨

This section identifies foreseeable harms and misunderstandings.

This model is the result of continued pretraining of a foundation model and has not undergone alignment. The model may:

- Overrepresent some viewpoints and underrepresent others
- Contain stereotypes
- Contain personal information
- Generate:
  - Racist and sexist content
  - Hateful, abusive, or violent language
  - Discriminatory or prejudicial language
  - Content that may not be appropriate for all settings, including sexual content
- Make errors, including presenting incorrect information, such as incorrect historical facts, as if it were factual
- Generate irrelevant or repetitive outputs

We are aware of the biases and potentially problematic or toxic content that current pretrained large language models exhibit. More specifically, as probabilistic models of (Italian and English) language, they reflect and amplify the biases of their training data.

The biomedical setting poses additional threats, including:

- Disparities in research focus, demographic representation, and reporting standards
- Reinforcement of existing medical paradigms while overlooking emerging or alternative viewpoints, which can hinder innovation and comprehensive care
- Generation of incorrect information and false claims, potentially leading to incorrect medical decisions

This model is therefore not intended to be used as is for any medical-related task.

Training and evaluation data

The model achieves the following results on the evaluation set (these match the final logged checkpoint, step 50000, in the training results):

Loss: 1.6976
Accuracy: 0.6011
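
For readers who prefer perplexity to raw loss, the snippet below converts the reported evaluation loss. It is a minimal sketch that assumes the loss is a mean token-level cross-entropy in nats, which is the usual convention for causal language model training with Transformers; the model card does not state this explicitly.

```python
import math

# Final evaluation loss reported above.
# Assumption: mean cross-entropy per token, measured in nats.
eval_loss = 1.6976

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")
```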

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 64
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.02
num_epochs: 1
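
The total batch sizes listed above follow from the per-device values, the number of devices, and the gradient accumulation steps. The short check below reproduces that arithmetic using only numbers from the list; the variable names simply mirror the hyperparameter names.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8             # per device
eval_batch_size = 8              # per device
num_devices = 4
gradient_accumulation_steps = 2

# Gradient accumulation only affects training, not evaluation.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 64
print(total_eval_batch_size)   # 32
```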

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.8964	0.0989	5000	1.8924	0.5713
1.8265	0.1978	10000	1.8264	0.5809
1.7883	0.2966	15000	1.7892	0.5866
1.7652	0.3955	20000	1.7626	0.5905
1.7415	0.4944	25000	1.7418	0.5939
1.7259	0.5933	30000	1.7253	0.5965
1.7106	0.6922	35000	1.7126	0.5985
1.7030	0.7910	40000	1.7037	0.6000
1.6969	0.8899	45000	1.6989	0.6009
1.6963	0.9888	50000	1.6976	0.6011
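
To make the learning-rate settings more concrete, here is a rough sketch of a cosine schedule with linear warmup using the hyperparameters listed in this card. Several things are assumptions rather than facts from the card: the total number of optimizer steps is estimated from the last logged row (step 50000 at epoch 0.9888), the warmup length is taken as the ceiling of `warmup_ratio * total_steps`, and the helper `lr_at` is a hypothetical re-implementation of a half-cycle cosine decay, not the training code that was actually used.

```python
import math

peak_lr = 5e-5        # learning_rate from the hyperparameter list
warmup_ratio = 0.02   # lr_scheduler_warmup_ratio from the hyperparameter list

# Assumption: estimate the steps in the single epoch from the last logged row.
steps_logged, epoch_logged = 50_000, 0.9888
total_steps = round(steps_logged / epoch_logged)
warmup_steps = math.ceil(warmup_ratio * total_steps)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup followed by half-cycle cosine decay."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(f"estimated total steps: {total_steps}, warmup steps: {warmup_steps}")
for step in (0, warmup_steps, 25_000, 50_000):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate reaches its peak after roughly a thousand steps and is close to zero by the last logged checkpoint, which is consistent with the validation loss flattening out in the final rows of the table.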

Framework versions

Transformers 4.40.2
PyTorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

Evaluation

The table below reports normalized accuracy for the Igea models on biomedical and general-domain datasets translated into Italian. The best-performing Minerva checkpoint is included for comparison.

Dataset	Domain	Minerva 3B (best base)	Igea 350M	Igea 1B	Igea 3B
MedMCQA-ITA (0-shot)	Biomed	0.293	0.250	0.307	0.313
Hellaswag-IT (0-shot)	General	0.519	0.303	0.357	0.491
ARC-IT (0-shot)	General	0.305	0.244	0.270	0.287
MMLU-IT (5-shot)	General	0.261	0.254	0.255	0.252
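
As a reading aid, the sketch below computes the per-dataset difference between Igea 3B and the Minerva 3B baseline. The scores are copied from the table above; the dictionary layout and the `deltas` name are only illustrative.

```python
# Normalized accuracy copied from the table above.
scores = {
    "MedMCQA-ITA (0-shot)": {"Minerva 3B": 0.293, "Igea 3B": 0.313},
    "Hellaswag-IT (0-shot)": {"Minerva 3B": 0.519, "Igea 3B": 0.491},
    "ARC-IT (0-shot)": {"Minerva 3B": 0.305, "Igea 3B": 0.287},
    "MMLU-IT (5-shot)": {"Minerva 3B": 0.261, "Igea 3B": 0.252},
}

# Positive values mean Igea 3B scores higher than the Minerva 3B baseline.
deltas = {
    name: round(row["Igea 3B"] - row["Minerva 3B"], 3)
    for name, row in scores.items()
}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.3f}")
```

On these numbers, Igea 3B is ahead of the baseline on the biomedical benchmark and behind it on the three general-domain benchmarks.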

Credits

Developed by Tommaso M. Buonocore and Simone Rancati.

Thanks to Michele Montebovi for his valuable advice.

Model size: 1.01B params (Safetensors, tensor type F32)

Base model: sapienzanlp/Minerva-1B-base-v1.0
