metadata
language:
- en
pipeline_tag: token-classification
Named Entity Recognition (NER) model to recognize chemical entities.
PubMedBERT fine-tuned on the following datasets:
- CHEMDNER patents CEMP corpus (train, dev, test sets)
- DDI corpus (train, dev, test sets): entity types "GROUP", "DRUG", "DRUG_N"
- GREC Corpus (train, dev, test sets): entity type "organic_compounds"
- MLEE (train, dev, test sets): entity type "Drug or compound"
- NLM-CHEM (train, dev, test sets)
- CHEMDNER (train, dev, test sets)
- ChEBI Corpus (train, dev, test sets): entity types "Metabolite", "Chemical"
- PHAEDRA (train, dev, test sets): entity type "Pharmacological_substance"
- ChemProt (train, dev, test sets)
- PGx Corpus (train, dev, test sets): entity type "Chemical"
- BioNLP11ID (train, dev, test sets): entity type "Chemical"
- BioNLP13CG (train, dev, test sets): entity type "Chemical"
- BC4CHEMD (train, dev, test sets)
- CRAFT corpus (train, dev, test sets): entity type "ChEBI"
- BC5CDR (train, dev, test sets): entity type "Chemical"
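
The corpora above use different names for chemical mentions, so they need one shared tag set before they can train a single model. The sketch below shows one way to do that harmonization. It is an illustration only: the card does not describe the actual preprocessing, and the `KEPT_TYPES` table, the `harmonize` function, and the `B-CHEMICAL`/`I-CHEMICAL`/`O` tag names are all hypothetical. The corpus names and entity types are copied from a few entries in the list above.

```python
from typing import Dict, List, Set, Tuple

# Entity types treated as "chemical" for a few of the corpora listed above.
# Hypothetical table: the real preprocessing is not documented in this card.
KEPT_TYPES: Dict[str, Set[str]] = {
    "DDI": {"GROUP", "DRUG", "DRUG_N"},
    "GREC": {"organic_compounds"},
    "MLEE": {"Drug or compound"},
    "CRAFT": {"ChEBI"},
    "BC5CDR": {"Chemical"},
}

# A span is (first token index, end token index exclusive, corpus entity type).
Span = Tuple[int, int, str]


def harmonize(corpus: str, n_tokens: int, spans: List[Span]) -> List[str]:
    """Collapse corpus-specific chemical types into one BIO tag sequence.

    Spans whose type is not in KEPT_TYPES[corpus] are left as "O".
    The B-/I-CHEMICAL tag names are an assumption, not the model's label set.
    """
    kept = KEPT_TYPES[corpus]
    tags = ["O"] * n_tokens
    for start, end, entity_type in spans:
        if entity_type not in kept:
            continue  # not a chemical type for this corpus
        tags[start] = "B-CHEMICAL"
        for i in range(start + 1, end):
            tags[i] = "I-CHEMICAL"
    return tags


# Made-up MLEE-style sentence: one gene mention (dropped), one compound (kept).
tokens = ["VEGF", "expression", "was", "reduced", "by", "nitric", "oxide"]
spans = [(0, 1, "Gene_or_gene_product"), (5, 7, "Drug or compound")]
tags = harmonize("MLEE", len(tokens), spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

With a mapping like this, every corpus contributes examples to the same chemical class regardless of how its annotators named the type.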