---
language:
- en
pipeline_tag: token-classification
license: apache-2.0
---
Named Entity Recognition (NER) model that recognizes disease entities in biomedical text.
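A minimal usage sketch, assuming the model is loaded through the Hugging Face `transformers` token-classification pipeline (the `pipeline_tag` in the metadata suggests this). `MODEL_ID` is a hypothetical placeholder and must be replaced with this model's actual Hub repository id; `load_disease_ner` and `extract_mentions` are illustrative helper names, not part of any library. The label string returned in `entity_group` depends on the model's configuration and is not stated in this card.

```python
from typing import Dict, List

# Hypothetical placeholder: replace with this model's actual Hub repository id.
MODEL_ID = "<namespace>/<model-name>"


def load_disease_ner(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first call)."""
    # Imported lazily so that extract_mentions can be used without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_mentions(predictions: List[Dict], min_score: float = 0.0) -> List[Dict]:
    """Keep the span text, label, and character offsets of each predicted entity.

    `predictions` is the list of dicts returned by the aggregated pipeline,
    each with the keys entity_group, score, word, start, and end.
    """
    return [
        {
            "text": p["word"],
            "label": p["entity_group"],
            "start": p["start"],
            "end": p["end"],
        }
        for p in predictions
        if p["score"] >= min_score
    ]
```

With a real repository id in place, `extract_mentions(load_disease_ner()("The patient was diagnosed with cystic fibrosis."))` would return one dict per predicted disease mention; `min_score` can be raised to drop low-confidence spans.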
Please cite our work:
@article{NILNKER2022,
title = {NILINKER: Attention-based approach to NIL Entity Linking},
journal = {Journal of Biomedical Informatics},
volume = {132},
pages = {104137},
year = {2022},
issn = {1532-0464},
doi = {https://doi.org/10.1016/j.jbi.2022.104137},
url = {https://www.sciencedirect.com/science/article/pii/S1532046422001526},
author = {Pedro Ruas and Francisco M. Couto},
}
PubMedBERT fine-tuned on the following datasets:
- NCBI Disease Corpus (train and dev sets)
- PHAEDRA (train, dev, test sets): entity type "Disorder"
- Corpus for Disease Names and Adverse Effects (train, dev, test sets): entity types "DISEASE", "ADVERSE"
- RareDis corpus (train, dev, test sets): entity types "DISEASE", "RAREDISEASE", "SYMPTOM"
- CoMAGC (train, dev, test sets): entity type "cancer_term"
- PGxCorpus (train, dev, test sets):
- miRNA-Test-Corpus (train, dev, test sets): entity type "Diseases"
- BC5CDR (train and dev sets): entity type "Disease"
- Mantra (train, dev, test sets): entity type "DISO"
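The corpora above annotate diseases under different entity type names. The sketch below shows one way such annotations could be collapsed into a single label before fine-tuning. This is an assumption about preprocessing, not a description of the authors' pipeline: `UNIFIED_LABEL` and `unify_label` are hypothetical names, and corpora for which the list gives no entity type (NCBI Disease Corpus, PGxCorpus) are left empty rather than guessed.

```python
from typing import Dict, Optional, Set

# Entity types per corpus, copied from the list above.
# Corpora with no type given in the list are left empty on purpose.
DISEASE_TYPES: Dict[str, Set[str]] = {
    "NCBI Disease Corpus": set(),
    "PHAEDRA": {"Disorder"},
    "Corpus for Disease Names and Adverse Effects": {"DISEASE", "ADVERSE"},
    "RareDis corpus": {"DISEASE", "RAREDISEASE", "SYMPTOM"},
    "CoMAGC": {"cancer_term"},
    "PGxCorpus": set(),
    "miRNA-Test-Corpus": {"Diseases"},
    "BC5CDR": {"Disease"},
    "Mantra": {"DISO"},
}

# Hypothetical name for the merged label; the card does not state the real one.
UNIFIED_LABEL = "Disease"


def unify_label(corpus: str, entity_type: str) -> Optional[str]:
    """Return the merged label if entity_type is listed for the corpus, else None."""
    if entity_type in DISEASE_TYPES.get(corpus, set()):
        return UNIFIED_LABEL
    return None
```

For example, `unify_label("Mantra", "DISO")` yields the merged label, while an annotation type that is not listed for its corpus, such as `unify_label("BC5CDR", "Chemical")`, yields `None` and would be dropped.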