Mistral 7B fine-tuned on Quaero for Named Entity Recognition (Generative)
This is a LoRA adapter version of unsloth/mistral-7b-instruct-v0.3, fine-tuned on the Quaero French medical dataset using a generative approach to Named Entity Recognition (NER).
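A minimal loading sketch in Python, using the base model id from this card; the adapter repo id below is a placeholder, not necessarily this repository's actual id:

```python
# Hedged loading sketch: the base model id comes from this card; the adapter
# repo id is a placeholder — replace it with this repository's actual id.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "unsloth/mistral-7b-instruct-v0.3"
ADAPTER_ID = "yqnis/mistral-7b-quaero-lora"  # placeholder id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto", torch_dtype="auto")
model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the LoRA adapter
model.eval()
```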
Task
The model was trained to extract entities from French biomedical sentences (MEDLINE entries) using a structured, prompt-based format.
| Tag | Description |
|---|---|
| DISO | Diseases or health-related conditions |
| ANAT | Anatomical parts (organs, tissues, body regions, etc.) |
| PROC | Medical or surgical procedures |
| DEVI | Medical devices or instruments |
| CHEM | Chemical substances or medications |
| LIVB | Living beings (e.g. humans, animals, bacteria, viruses) |
| GEOG | Geographical locations (e.g. countries, regions) |
| OBJC | Physical objects not covered by other categories |
| PHEN | Biological processes (e.g. inflammation, mutation) |
| PHYS | Physiological functions (e.g. respiration, vision) |
I use `<>` as a separator, and the output format is:

`TAG_1 entity_1 <> TAG_2 entity_2 <> ... <> TAG_n entity_n`
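Turning a generated string back into (tag, entity) pairs is a simple split on the separator. A minimal sketch (the `parse_entities` helper is illustrative, not part of any released code):

```python
# Parse the "<>"-separated output format into (tag, entity) tuples.
def parse_entities(output: str):
    pairs = []
    for chunk in output.split("<>"):
        chunk = chunk.strip()
        if not chunk:
            continue
        tag, _, entity = chunk.partition(" ")  # first token is the tag
        pairs.append((tag, entity.strip()))
    return pairs

print(parse_entities("DISO tolérance <> CHEM prazosine <> LIVB patients"))
# [('DISO', 'tolérance'), ('CHEM', 'prazosine'), ('LIVB', 'patients')]
```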
Dataset
The original dataset is the QUAERO French Medical Corpus, which I converted to a JSON format for generative, instruction-style training.
```json
{
  "input": "Etude de l'efficacité et de la tolérance de la prazosine à libération prolongée chez des patients hypertendus et diabétiques non insulinodépendants.",
  "output": "DISO tolérance <> CHEM prazosine <> LIVB patients <> DISO hypertendus <> DISO diabétiques non insulinodépendants"
}
```
The QUAERO French Medical Corpus features overlapping entity spans, including nested structures. For instance:
```json
{
  "input": "Cancer du pancréas",
  "output": "DISO Cancer <> DISO Cancer du pancréas <> ANAT pancréas"
}
```
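As a hedged sketch, here is one way such a JSON record could be rendered into a single training string for SFT; the exact prompt template used during fine-tuning is not documented in this card, so the instruction text and the `[INST]`/`[/INST]` wrapping below are assumptions:

```python
# Assumed formatting step for instruction-style SFT. The instruction wording and
# the Mistral [INST]/[/INST] wrapping are illustrative, not the card's exact template.
record = {
    "input": "Cancer du pancréas",
    "output": "DISO Cancer <> DISO Cancer du pancréas <> ANAT pancréas",
}

def to_training_text(rec: dict) -> str:
    instruction = "Extract the medical entities from the following French sentence."
    return f"[INST] {instruction}\n{rec['input']} [/INST] {rec['output']}"

print(to_training_text(record))
```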
Evaluation
Evaluation was performed on the test split by comparing the predicted entity set against the ground truth annotations using exact (type, entity) matching.
| Metric | Score |
|---|---|
| Precision | 0.6883 |
| Recall | 0.7143 |
| F1 score | 0.7011 |
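A small sketch of the set-based scoring described above, using exact (type, entity) matching; the gold and predicted sets below are invented purely for illustration:

```python
# Set-based precision/recall/F1 with exact (type, entity) matching.
def prf(pred: set, gold: set):
    tp = len(pred & gold)                       # exact (type, entity) matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = {("DISO", "Cancer"), ("DISO", "Cancer du pancréas"), ("ANAT", "pancréas")}
pred = {("DISO", "Cancer du pancréas"), ("ANAT", "pancréas"), ("CHEM", "gemcitabine")}
print(prf(pred, gold))  # ≈ (0.667, 0.667, 0.667)
```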
Other formats
This model is also available in the following formats:
- 16bit → yqnis/mistral-7b-quaero
- GGUF Q5_K_M → yqnis/mistral-7b-quaero-gguf
This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.