---
license: cc-by-4.0
language:
- es
tags:
- biomedical
- clinical
- ner
metrics:
- f1
widget:
- text: "Se realizó angiotomografía urgente de arterias pulmonares, que mostró tromboembolia pulmonar bilateral con dilatación ventricular derecha, además de opacidades periféricas parcheadas compatibles con neumonía por SARS-CoV-2, que se confirmó en la PCR."
  example_title: "COVID-19"
- text: "El paciente presenta HTA en tratamiento con IECA y alfa-bloqueante, artritis reumatoide en tratamiento con corticoesteroide oral."
  example_title: "Oncology"
- text: "Otros antecedentes de importancia son la captura de 30 insectos dentro de la vivienda, de los cuales tres fueron positivos a la infección por Trypanosoma cruzi y las características de la vivienda con materiales de construcción considerados de riesgo para la presencia del transmisor"
  example_title: "Tropical medicine"
- text: "Tras la evaluación de la paciente por medio de exploración psicopatológica, la orientación diagnóstica es de trastorno adaptativo tipo mixto."
  example_title: "Psychiatry"
- text: "Los hallazgos descritos son compatibles con quiste braquial del segundo arco complicado con proceso inflamatorio - infeccioso, sin poder descartar proceso maligno subyacente."
  example_title: "Otorhinolaryngology"
---

# Disease mention recognizer for Spanish clinical texts 🦠🔬

This model results from the SINAI team's participation in the [DISease TExt Mining Shared Task (DISTEMIST)](https://temu.bsc.es/distemist/). The DISTEMIST-entities subtrack required automatically finding disease mentions in clinical cases. Given the length of the clinical texts in the dataset, we opted for a sentence-level NER approach based on fine-tuning a [RoBERTa model pre-trained on Spanish biomedical corpora](https://huggingface.co/PlanTL-GOB-ES/bsc-bio-es).

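The sentence-level approach described above can be illustrated with a minimal sketch: split a clinical case into sentences, tag each sentence separately, then shift the entity offsets back to document coordinates. This is not the SINAI team's actual pipeline. The regex splitter, the function names, the `DISEASE` label and the `toy_tagger` stand-in are all assumptions made for illustration only.

```python
import re
from typing import Any, Callable, Dict, List, Tuple

Entity = Dict[str, Any]


def split_sentences(text: str) -> List[Tuple[int, str]]:
    """Naive splitter (an assumption): cut after '.', '!' or '?' followed by whitespace.

    Returns (start_offset, sentence) pairs so document offsets can be restored.
    """
    pattern = re.compile(r"\S.*?(?:[.!?](?=\s)|$)", flags=re.DOTALL)
    return [(m.start(), m.group()) for m in pattern.finditer(text)]


def tag_document(text: str, tag_sentence: Callable[[str], List[Entity]]) -> List[Entity]:
    """Run a sentence-level tagger and map its spans back to document offsets."""
    entities: List[Entity] = []
    for offset, sentence in split_sentences(text):
        for ent in tag_sentence(sentence):
            start = offset + int(ent["start"])
            end = offset + int(ent["end"])
            entities.append({
                "entity_group": ent["entity_group"],
                "start": start,
                "end": end,
                "word": text[start:end],
            })
    return entities


def toy_tagger(sentence: str) -> List[Entity]:
    """Hypothetical stand-in for the fine-tuned model: flags a fixed list of terms."""
    terms = ("tromboembolia pulmonar", "neumonía")
    found: List[Entity] = []
    for term in terms:
        for m in re.finditer(re.escape(term), sentence):
            # "DISEASE" is a placeholder label, not necessarily the model's label name.
            found.append({"entity_group": "DISEASE", "start": m.start(), "end": m.end()})
    return found


if __name__ == "__main__":
    doc = "La TC mostró tromboembolia pulmonar bilateral. Se confirmó neumonía en la PCR."
    for ent in tag_document(doc, toy_tagger):
        print(ent)
```

With the Hugging Face `transformers` library, a `pipeline("token-classification", model=..., aggregation_strategy="simple")` object could be passed as `tag_sentence` in place of `toy_tagger`, since it returns dictionaries with `entity_group`, `start` and `end` keys. The model identifier is left out here because this card does not state it.
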
# Evaluation and results

Applying a biomedical model to EHRs can be considered a cross-domain experiment. The encouraging results our biomedical system achieves on the NER task point to domain transfer potential between the biomedical and clinical fields.

The table below summarizes the official micro-averaged scores obtained by this model in the official evaluation. Team standings are available [here](http://participants-area.bioasq.org/results/DisTEMIST/).

| Precision | Recall | F1-score |
|-----------|--------|----------|
| 0.7520    | 0.7259 | 0.7387   |

# System description paper and citation

The system description [paper](http://www.dei.unipd.it/~ferro/CLEF-WN-Drafts/CLEF2022/paper-17.pdf) is published in the proceedings of the 10th BioASQ Workshop, held as a Lab at CLEF 2022 on September 5-8, 2022:

```bibtex
@inproceedings{ChizhikovaEtAl:CLEF2022,
  title = {SINAI at CLEF 2022: Leveraging biomedical transformers to detect and normalize disease mentions},
  author = {Mariia Chizhikova and Jaime Collado-Montañéz and Pilar López-Úbeda and Manuel C. Díaz-Galiano and L. Alfonso Ureña-López and M. Teresa Martín-Valdivia},
  pages = {265--273},
  url = {http://ceur-ws.org/Vol-XXX/#paper-17},
  crossref = {CLEF2022}}
```