roberta-base-biomedical-clinical-es for QA

This model was trained as part of the "Extractive QA Biomedicine" project developed during the 2022 Hackathon organized by SOMOS NLP.

Motivation

Recent research has made available Spanish language models trained on biomedical corpora. This project explores using these models to build extractive question answering (QA) models for biomedicine, and compares their effectiveness with that of general-domain masked language models.

The models trained during the Hackathon were:

hackathon-pln-es/roberta-base-bne-squad2-es

hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es

hackathon-pln-es/roberta-base-biomedical-es-squad2-es

hackathon-pln-es/biomedtra-small-es-squad2-es

Description

This model is a fine-tuned version of PlanTL-GOB-ES/roberta-base-biomedical-clinical-es on the squad_es (v2) training dataset.
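As a minimal usage sketch, the model can presumably be loaded through the Hugging Face `transformers` question-answering pipeline. The helper name `answer_question` is hypothetical, and the model identifier is taken from the list in this card on the assumption that it still resolves on the Hub. Because the training data follows the SQuAD v2 convention (which includes unanswerable questions), the sketch passes `handle_impossible_answer=True` so that the pipeline may return an empty answer.

```python
# Identifier as listed in this card; adjust it if the repository has moved.
MODEL_ID = "hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es"


def answer_question(question, context, model_id=MODEL_ID):
    """Return the pipeline's answer dict for one question/context pair.

    Hypothetical helper for illustration; not part of the original project.
    """
    # Imported lazily so that defining this helper does not require the
    # library (or a network connection) until it is actually called.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id, tokenizer=model_id)
    # SQuAD v2-style data contains unanswerable questions, so allow an
    # empty answer instead of always forcing a span.
    return qa(question=question, context=context, handle_impossible_answer=True)
```

A call such as `answer_question("¿Qué hormona regula la glucosa en sangre?", "La insulina es una hormona que regula la glucosa en sangre.")` should return a dict with `answer`, `score`, `start`, and `end` keys. The first call downloads the checkpoint.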

Hyperparameters

The hyperparameters were chosen based on those used in PlanTL-GOB-ES/roberta-base-bne-sqac, a Spanish QA model trained on a dataset in SQuAD v1 format.

 --num_train_epochs 2
 --learning_rate 3e-5
 --weight_decay 0.01
 --max_seq_length 386
 --doc_stride 128 
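To illustrate how `max_seq_length` and `doc_stride` interact, the sketch below splits a long context into overlapping windows. It is a simplification under stated assumptions, not the training script: it assumes `doc_stride` is the number of tokens shared by consecutive windows (as with the `stride` argument of Hugging Face tokenizers), that a RoBERTa question/context pair uses 4 special tokens, and that the question is 20 tokens long (a made-up figure).

```python
def sliding_windows(n_context_tokens, capacity, overlap):
    """Return (start, end) token spans that cover the context.

    Each window holds at most `capacity` context tokens, and consecutive
    windows share `overlap` tokens.
    """
    if capacity <= overlap:
        raise ValueError("capacity must be larger than overlap")
    step = capacity - overlap
    spans = []
    start = 0
    while True:
        end = min(start + capacity, n_context_tokens)
        spans.append((start, end))
        if end == n_context_tokens:
            break
        start += step
    return spans


MAX_SEQ_LENGTH = 386   # from the hyperparameters above
DOC_STRIDE = 128       # from the hyperparameters above
SPECIAL_TOKENS = 4     # assumption: <s> question </s></s> context </s>
QUESTION_TOKENS = 20   # hypothetical question length

# Context tokens that fit in one window once the question is accounted for.
capacity = MAX_SEQ_LENGTH - QUESTION_TOKENS - SPECIAL_TOKENS  # 362

windows = sliding_windows(1000, capacity, DOC_STRIDE)
print(windows)  # [(0, 362), (234, 596), (468, 830), (702, 1000)]
```

Under these assumptions a 1000-token context yields four windows, so an answer span near a window boundary is still fully contained in at least one of them.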

Performance

Evaluated on the hackathon-pln-es/biomed_squad_es_v2 dev set.

| Model | Base Model Domain | exact | f1 | HasAns_exact | HasAns_f1 | NoAns_exact | NoAns_f1 |
|---|---|---|---|---|---|---|---|
| hackathon-pln-es/roberta-base-bne-squad2-es | General | 67.6341 | 75.6988 | 53.7367 | 70.0526 | 81.2174 | 81.2174 |
| hackathon-pln-es/roberta-base-biomedical-clinical-es-squad2-es | Biomedical | 66.8426 | 75.2346 | 53.0249 | 70.0031 | 80.3478 | 80.3478 |
| hackathon-pln-es/roberta-base-biomedical-es-squad2-es | Biomedical | 67.6341 | 74.5612 | 47.6868 | 61.7012 | 87.1304 | 87.1304 |
| hackathon-pln-es/biomedtra-small-es-squad2-es | Biomedical | 34.4767 | 44.3294 | 45.3737 | 65.307 | 23.8261 | 23.8261 |
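For readers unfamiliar with the `exact` and `f1` columns, the following is a rough sketch of SQuAD-style exact match and token-level F1 for a single prediction. It is not the official evaluation script: the normalization here (lowercasing, stripping ASCII punctuation, collapsing whitespace) is a simplifying assumption. When either answer is empty, the score is 1 if both are empty and 0 otherwise, which is consistent with the NoAns_exact and NoAns_f1 columns being identical.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Unanswerable case: credit only if both sides are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("La insulina.", "la insulina"))   # 1.0
print(f1_score("la insulina humana", "insulina"))   # 0.5
```

Dataset-level scores such as those in the table are averages of these per-question values (taking the best match over the available reference answers), expressed as percentages.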

Team

Santiago Maximo: smaximo
