Turmbücher LM
This repository contains the language models (forward & backward) that were used to train the Turmbücher NER.
Both models target premodern German and were trained by Ismail Prada Ziegler as part of a Digital Humanities research project at the University of Bern.
We recommend combining the forward and backward models with Flair's StackedEmbeddings for best results.
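A minimal sketch of how the two models might be combined, assuming the `flair` package is installed and the model files have been downloaded locally. The file paths `models/forward.pt` and `models/backward.pt` and the helper names are hypothetical placeholders, not the repository's actual file names; substitute the files shipped with this repository.

```python
from pathlib import Path

# Hypothetical local paths (assumption): replace with the actual
# forward/backward model files from this repository.
FORWARD_MODEL = Path("models/forward.pt")
BACKWARD_MODEL = Path("models/backward.pt")


def build_stacked_embeddings(forward_model=FORWARD_MODEL,
                             backward_model=BACKWARD_MODEL):
    """Combine the forward and backward language models into one embedding."""
    # Imported lazily so this module can be loaded without flair installed.
    from flair.embeddings import FlairEmbeddings, StackedEmbeddings

    return StackedEmbeddings([
        FlairEmbeddings(str(forward_model)),
        FlairEmbeddings(str(backward_model)),
    ])


def embed_text(text, embeddings):
    """Return one embedding tensor per token of `text`."""
    from flair.data import Sentence

    sentence = Sentence(text)
    embeddings.embed(sentence)  # attaches embeddings to each token in place
    return [token.embedding for token in sentence]


# Example usage (requires flair and the downloaded model files):
#   embeddings = build_stacked_embeddings()
#   vectors = embed_text("your text here", embeddings)
```

The stacked embedding concatenates the forward and backward vectors for each token, so a downstream tagger sees context from both directions.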
Data Set
Main data set: Berner Turmbücher, early volumes from the 16th century, Early New High German, 61k tokens of training data.
Secondary data sets:
- SSRQ - Fribourg, 59k tokens.
- Chorgerichtsmanuale (unpublished), 76k tokens.
- Königsfelden Charters, 623k tokens.
- Talgerichtsprotokolle (unpublished), 438k tokens.