---
language:
- en
pipeline_tag: token-classification
license: apache-2.0
---

Named Entity Recognition (NER) model that recognizes chemical entities.

Please cite our work:

```
@article{NILNKER2022,
  title = {NILINKER: Attention-based approach to NIL Entity Linking},
  journal = {Journal of Biomedical Informatics},
  volume = {132},
  pages = {104137},
  year = {2022},
  issn = {1532-0464},
  doi = {https://doi.org/10.1016/j.jbi.2022.104137},
  url = {https://www.sciencedirect.com/science/article/pii/S1532046422001526},
  author = {Pedro Ruas and Francisco M. Couto},
}
```

[PubMedBERT](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext) fine-tuned on the following datasets:

- [Chemdner patents CEMP corpus](https://biocreative.bioinformatics.udel.edu/resources/corpora/chemdner-patents-cemp-corpus/) (train, dev, test sets)
- [DDI corpus](https://github.com/isegura/DDICorpus) (train, dev, test sets): entity types "GROUP", "DRUG", "DRUG_N"
- [GREC Corpus](http://www.nactem.ac.uk/GREC/standoff.php) (train, dev, test sets): entity type "organic_compounds"
- [MLEE](http://nactem.ac.uk/MLEE/) (train, dev, test sets): entity type "Drug or compound"
- [NLM-CHEM](https://ftp.ncbi.nlm.nih.gov/pub/lu/NLMChem/) (train, dev, test sets)
- [CHEMDNER](https://biocreative.bioinformatics.udel.edu/resources/) (train, dev, test sets)
- [Chebi Corpus](http://www.nactem.ac.uk/chebi/) (train, dev, test sets): entity types "Metabolite", "Chemical"
- [PHAEDRA](http://www.nactem.ac.uk/PHAEDRA/) (train, dev, test sets): entity type "Pharmacological_substance"
- [Chemprot](https://biocreative.bioinformatics.udel.edu/tasks/biocreative-vi/track-5/) (train, dev, test sets)
- [PGx Corpus](https://github.com/practikpharma/PGxCorpus) (train, dev, test sets): entity type "Chemical"
- [BioNLP11ID](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BioNLP11ID-chem-IOB) (train, dev, test sets): entity type "Chemical"
- [BioNLP13CG]() (train, dev, test sets): entity type "Chemical"
- [BC4CHEMD](https://github.com/cambridgeltl/MTL-Bioinformatics-2016/tree/master/data/BC4CHEMD) (train, dev, test sets)
- [CRAFT corpus](https://github.com/UCDenver-ccp/CRAFT/tree/master/concept-annotation) (train, dev, test sets): entity type "ChEBI"
- [BC5CDR]() (train, dev, test sets): entity type "Chemical"