hmTEAMS

Historic Multilingual and Monolingual TEAMS Models. The following languages are covered:

  • English (British Library Corpus - Books)
  • German (Europeana Newspaper)
  • French (Europeana Newspaper)
  • Finnish (Europeana Newspaper, Digilib)
  • Swedish (Europeana Newspaper, Digilib)
  • Dutch (Delpher Corpus)
  • Norwegian (NCC Corpus)

Architecture

We pretrain a "Training ELECTRA Augmented with Multi-word Selection" (TEAMS) model:

[Figure: hmTEAMS overview]

Results

We perform experiments on various historic NER datasets, such as HIPE-2022 and ICDAR Europeana. All details, including hyperparameters, can be found here.

Small Benchmark

We test our pretrained language models on various datasets from HIPE-2020, HIPE-2022 and Europeana. The following table gives an overview of the datasets used.

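| Language | Dataset         | Additional Dataset |
|----------|-----------------|--------------------|
| English  | AjMC            | -                  |
| German   | AjMC            | -                  |
| French   | AjMC            | ICDAR-Europeana    |
| Finnish  | NewsEye         | -                  |
| Swedish  | NewsEye         | -                  |
| Dutch    | ICDAR-Europeana | -                  |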

Results

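| Model                        | English AjMC | German AjMC  | French AjMC  | Finnish NewsEye | Swedish NewsEye | Dutch ICDAR  | French ICDAR | Avg.  |
|------------------------------|--------------|--------------|--------------|-----------------|-----------------|--------------|--------------|-------|
| hmBERT (32k) Schweter et al. | 85.36 ± 0.94 | 89.08 ± 0.09 | 85.10 ± 0.60 | 77.28 ± 0.37    | 82.85 ± 0.83    | 82.11 ± 0.61 | 77.21 ± 0.16 | 82.71 |
| hmTEAMS (Ours)               | 86.41 ± 0.36 | 88.64 ± 0.42 | 85.41 ± 0.67 | 79.27 ± 1.88    | 82.78 ± 0.60    | 88.21 ± 0.39 | 78.03 ± 0.39 | 84.11 |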
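
The Avg. column can be reproduced from the per-dataset scores. The sketch below assumes that Avg. is the unweighted arithmetic mean of the seven dataset columns; the card does not state this, but the assumption matches the reported values. The helper name `macro_average` is illustrative, and the ± values are left out.

```python
# Per-dataset scores copied from the results table, in column order:
# English AjMC, German AjMC, French AjMC, Finnish NewsEye,
# Swedish NewsEye, Dutch ICDAR, French ICDAR.
scores = {
    "hmBERT (32k)": [85.36, 89.08, 85.10, 77.28, 82.85, 82.11, 77.21],
    "hmTEAMS": [86.41, 88.64, 85.41, 79.27, 82.78, 88.21, 78.03],
}


def macro_average(values):
    """Unweighted mean over datasets (assumed definition of the Avg. column)."""
    return sum(values) / len(values)


# Round to two decimals, as in the table.
averages = {model: round(macro_average(vals), 2) for model, vals in scores.items()}

# Difference between the two averaged scores, in points.
improvement = round(averages["hmTEAMS"] - averages["hmBERT (32k)"], 2)

for model, avg in averages.items():
    print(f"{model}: {avg:.2f}")
print(f"Difference: {improvement:.2f}")
```

Under this assumption, the two averages match the table (82.71 and 84.11), a difference of 1.40 points in favour of hmTEAMS. Most of that gap comes from the Dutch ICDAR and Finnish NewsEye columns.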

Release

Our pretrained hmTEAMS model can be obtained from the Hugging Face Model Hub. Because of complicated license issues that still need to be resolved, the model is only available by requesting access on the Model Hub.

Acknowledgements

We thank Luisa März, Katharina Schmid and Erion Çano for their fruitful discussions about Historic Language Models.

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC). Many thanks for providing access to the TPUs ❤️
