Clinical NER model using spaCy's SpanCategorizer implementation and medBERT.de.

Usage (the leading "!" runs a shell command from a Jupyter notebook; omit it in a terminal):

!huggingface-cli download phlobo/de_ggponc_medbertde de_ggponc_medbertde-1.0.0-py3-none-any.whl --local-dir .
!pip install de_ggponc_medbertde-1.0.0-py3-none-any.whl

import spacy
nlp = spacy.load('de_ggponc_medbertde')
d = nlp("allein nach Versagen einer Behandlung mit Oxaliplatin und Irinotecan")
for e in d.spans['entities']:
  print(e, e.label_)

This prints:

Oxaliplatin Clinical_Drug
Irinotecan Clinical_Drug
Versagen einer Behandlung Other_Finding
Behandlung mit Oxaliplatin und Irinotecan Therapeutic
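
Because the span categorizer can return overlapping spans (here, "Behandlung mit Oxaliplatin und Irinotecan" contains both drug mentions), it can help to group the predictions by label before further processing. The following is a minimal sketch, not part of the model package: the helper name group_by_label is hypothetical, and it works on plain (text, label) pairs so that it runs without the model installed.

```python
from collections import defaultdict


def group_by_label(pairs):
    """Group (span_text, label) pairs into {label: [span_text, ...]}.

    Hypothetical helper for illustration; not part of de_ggponc_medbertde.
    """
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


# The four spans from the example output above, as (text, label) pairs.
example_pairs = [
    ("Oxaliplatin", "Clinical_Drug"),
    ("Irinotecan", "Clinical_Drug"),
    ("Versagen einer Behandlung", "Other_Finding"),
    ("Behandlung mit Oxaliplatin und Irinotecan", "Therapeutic"),
]

grouped = group_by_label(example_pairs)
for label, texts in grouped.items():
    print(label, texts)
```

With the pipeline loaded as shown above, the pairs could be built with [(e.text, e.label_) for e in d.spans['entities']].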

The model was trained on the gold-standard annotations of GGPONC 2.0 (https://aclanthology.org/2022.lrec-1.389/).

It detects the following 8 entity classes:

  • Findings: Diagnosis / Pathology and Other Findings
  • Substances: Clinical Drug, Nutrients / Body Substances, External Substances
  • Procedures: Therapeutic, Diagnostic
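
If only the three top-level groups are needed, the fine-grained labels can be collapsed with a lookup table. This is a sketch under assumptions: only the label strings Clinical_Drug, Other_Finding and Therapeutic are confirmed by the example output above; the remaining keys are guesses at the naming scheme and should be checked against the labels the loaded pipeline actually reports. The names TOP_LEVEL and top_level_category are hypothetical.

```python
# Mapping from fine-grained label to top-level group.
# Keys marked "assumed" do not appear in the example output and may be
# spelled differently in the actual model.
TOP_LEVEL = {
    "Diagnosis_or_Pathology": "Findings",        # assumed label string
    "Other_Finding": "Findings",
    "Clinical_Drug": "Substances",
    "Nutrient_or_Body_Substance": "Substances",  # assumed label string
    "External_Substance": "Substances",          # assumed label string
    "Therapeutic": "Procedures",
    "Diagnostic": "Procedures",                  # assumed label string
}


def top_level_category(label):
    """Return the top-level group for a label, or 'Unknown' if unmapped."""
    return TOP_LEVEL.get(label, "Unknown")


print(top_level_category("Clinical_Drug"))
print(top_level_category("Therapeutic"))
```

Returning "Unknown" instead of raising makes a misspelled or unexpected label visible without interrupting a batch run.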

The configuration for training the model is available here: https://github.com/hpi-dhc/ggponc

When using the model, please cite the following publication:

@inproceedings{borchert-etal-2022-ggponc,
    title = "{GGPONC} 2.0 - The {G}erman Clinical Guideline Corpus for Oncology: Curation Workflow, Annotation Policy, Baseline {NER} Taggers",
    author = "Borchert, Florian  and
      Lohr, Christina  and
      Modersohn, Luise  and
      Witt, Jonas  and
      Langer, Thomas  and
      Follmann, Markus  and
      Gietzelt, Matthias  and
      Arnrich, Bert  and
      Hahn, Udo  and
      Schapranow, Matthieu-P.",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    pages = "3650--3660"
}

| Feature | Description |
| --- | --- |
| Name | de_ggponc_medbertde |
| Version | 1.0.0 |
| spaCy | >=3.4.4,<3.5.0 |
| Default Pipeline | transformer, morphologizer, parser, transformer_spancat, spancat |
| Components | transformer, morphologizer, parser, transformer_spancat, spancat |
| License | The model may be used for non-commercial research activities only, see also the Terms of Use of GGPONC: https://www.leitlinienprogramm-onkologie.de/projekte/ggponc-english |
| Author | Florian Borchert |
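
The spaCy constraint >=3.4.4,<3.5.0 is narrow, so it may be worth checking the installed version before loading the model. Below is a simplified sketch using only the standard library; the helper names parse_version and is_supported are hypothetical, and the comparison looks only at the first three numeric fields, so it does not implement full PEP 440 semantics for pre-releases.

```python
import re


def parse_version(version):
    """Turn a string such as '3.4.4' into a tuple such as (3, 4, 4).

    Only the leading digits of the first three dot-separated fields are
    used; missing fields are treated as 0.
    """
    parts = []
    for field in version.split(".")[:3]:
        match = re.match(r"\d+", field)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported(version, lower=(3, 4, 4), upper=(3, 5, 0)):
    """Check lower <= version < upper, mirroring '>=3.4.4,<3.5.0'."""
    return lower <= parse_version(version) < upper


for candidate in ["3.4.3", "3.4.4", "3.5.0"]:
    print(candidate, is_supported(candidate))
```

With spaCy installed, the check could be applied as is_supported(spacy.__version__).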