English NER model for extraction of named entities from scientific acknowledgement texts using Flair Embeddings

F1-Score: 0.79

Predicts 6 tags:

label	description	precision	recall	f1-score	support
GRNB	grant number	0.93	0.98	0.96	160
IND	person	0.98	0.98	0.98	295
FUND	funding organization	0.70	0.83	0.76	157
UNI	university	0.77	0.74	0.75	99
MISC	miscellaneous	0.65	0.65	0.65	82
COR	corporation	0.75	0.50	0.60	12
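
The per-label F1 values are the harmonic mean of precision and recall. The following is a minimal sketch for recomputing them from the table above; the helper name `f1_score` is illustrative and not part of this model or of Flair. Because the table reports precision and recall rounded to two decimals, the recomputed values can differ from the reported ones in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "GRNB": (0.93, 0.98, 0.96),
    "IND": (0.98, 0.98, 0.98),
    "FUND": (0.70, 0.83, 0.76),
    "UNI": (0.77, 0.74, 0.75),
    "MISC": (0.65, 0.65, 0.65),
    "COR": (0.75, 0.50, 0.60),
}

for label, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{label}: recomputed {recomputed:.3f}, reported {f1:.2f}")
```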

Based on Flair embeddings

Usage

Requires: Flair (pip install flair)

# import libraries
from flair.data import Sentence
from flair.models import SequenceTagger

# load the trained model
model = SequenceTagger.load("kalawinka/flair-ner-acknowledgments")

# create example sentence
sentence = Sentence("This work was supported by State Key Lab of Ocean Engineering Shanghai Jiao Tong University and financially supported by China National Scientific and Technology Major Project (Grant No. 2016ZX05028-006-009)")

# predict the tags
model.predict(sentence)

# print output as spans
for entity in sentence.get_spans('ner'):
    print(entity)

This produces the following output:

Span[5:15]: "State Key Lab of Ocean Engineering Shanghai Jiao Tong University" → UNI (0.9396)
Span[19:26]: "China National Scientific and Technology Major Project" → FUND (0.9865)
Span[29:30]: "2016ZX05028-006-009" → GRNB (0.9996)
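
For downstream processing, the spans can be converted into plain records and grouped by label. The following is a minimal sketch, assuming that each Flair span exposes `text` and `get_label("ner")` with `value` and `score` attributes, as in recent Flair releases. The helper names `span_to_record` and `group_by_label` and the default threshold of 0.5 are illustrative assumptions, not part of the model. So that the sketch runs without downloading the model, the records are copied from the example output above.

```python
from collections import defaultdict


def span_to_record(span):
    """Convert a Flair span into a plain dict (assumes Flair's Span/Label API)."""
    label = span.get_label("ner")
    return {"text": span.text, "label": label.value, "score": label.score}


def group_by_label(records, min_score=0.5):
    """Group entity texts by label, dropping predictions below min_score."""
    grouped = defaultdict(list)
    for record in records:
        if record["score"] >= min_score:
            grouped[record["label"]].append(record["text"])
    return dict(grouped)


# Records copied from the example output above. With Flair installed, they
# could instead be built from the tagged sentence:
#   records = [span_to_record(s) for s in sentence.get_spans("ner")]
records = [
    {
        "text": "State Key Lab of Ocean Engineering Shanghai Jiao Tong University",
        "label": "UNI",
        "score": 0.9396,
    },
    {
        "text": "China National Scientific and Technology Major Project",
        "label": "FUND",
        "score": 0.9865,
    },
    {"text": "2016ZX05028-006-009", "label": "GRNB", "score": 0.9996},
]

print(group_by_label(records))
```

Raising `min_score` keeps only the predictions the model is most confident about; a suitable threshold depends on the application and is not prescribed by this model.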

You can try the model by copying the following acknowledgement text into the small text box on the right and clicking “Compute”:

The original work was funded by the German Center for Higher Education Research and Science Studies (DZHW) via the project "Mining Acknowledgement Texts in Web of Science (MinAck)". Access to the WoS data was granted via the Competence Centre for Bibliometrics. Data access was funded by BMBF (Federal Ministry of Education and Research, Germany) under grant number 01PQ17001. Nina Smirnova received funding from the German Research Foundation (DFG) via the project "POLLUX". The present paper is an extended version of the paper "Evaluation of Embedding Models for Automatic Extraction and Classification of Acknowledged Entities in Scientific Documents" (Smirnova & Mayr, 2022) presented at the 3rd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2022).

For more examples, see our Google Colab notebook.

Citation

If you use this model, please consider citing this work:

@misc{smirnova2023embedding,
      title={Embedding Models for Supervised Automatic Extraction and Classification of Named Entities in Scientific Acknowledgements}, 
      author={Nina Smirnova and Philipp Mayr},
      year={2023},
      eprint={2307.13377},
      archivePrefix={arXiv},
      primaryClass={cs.DL}
}