Task: fill-mask (mask token: `[MASK]`)
API endpoint
```shell
curl -X POST \
    -H "Authorization: Bearer YOUR_ORG_OR_USER_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '"json encoded string"' \
    https://api-inference.huggingface.co/models/redewiedergabe/bert-base-historical-german-rw-cased
```
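The same POST request can be built from Python using only the standard library. A minimal sketch: the token placeholder and the example sentence are illustrative, and the `{"inputs": ...}` payload shape follows the Inference API's convention for fill-mask models.

```python
import json
import urllib.request

API_URL = ("https://api-inference.huggingface.co/models/"
           "redewiedergabe/bert-base-historical-german-rw-cased")

def build_request(text, token="YOUR_ORG_OR_USER_API_TOKEN"):
    """Build the same POST request the curl command above sends."""
    data = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Example sentence (illustrative):
req = build_request("Er sagte, er werde [MASK] kommen.")
# with urllib.request.urlopen(req) as resp:   # uncomment to actually call the API
#     predictions = json.load(resp)
```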


Frameworks: PyTorch, TensorFlow

Contributed by redewiedergabe

How to use this model directly from the 🤗/transformers library:

			
```python
from transformers import AutoTokenizer, AutoModelWithLMHead

tokenizer = AutoTokenizer.from_pretrained("redewiedergabe/bert-base-historical-german-rw-cased")
model = AutoModelWithLMHead.from_pretrained("redewiedergabe/bert-base-historical-german-rw-cased")
```
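Beyond loading the tokenizer and model, the `pipeline` API is a convenient wrapper for fill-mask inference. A minimal sketch: the helper name and the example sentence are illustrative, and the model weights are downloaded on first use.

```python
from transformers import pipeline  # assumes transformers (and torch or tf) installed

def top_predictions(sentence, k=5):
    """Return the k most likely fillers for the [MASK] token."""
    fill = pipeline(
        "fill-mask",
        model="redewiedergabe/bert-base-historical-german-rw-cased",
    )
    return [p["token_str"] for p in fill(sentence, top_k=k)]

# Example (downloads the model on first use):
# top_predictions("Sie sagte, sie [MASK] gern kommen.")
```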

Model description

Dataset

Trained on fictional and non-fictional German texts written between 1840 and 1920.

Hardware used

1 Tesla P4 GPU

Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 3 |
| Gradient accumulation steps | 1 |
| Train batch size | 32 |
| Learning rate | 0.00003 |
| Max sequence length | 128 |

Evaluation results: Automatic tagging of four forms of speech/thought/writing representation in historical fictional and non-fictional German texts

The language model was used to tag direct, indirect, reported, and free indirect speech/thought/writing representation in fictional and non-fictional German texts. The tagger is available and described in detail at https://github.com/redewiedergabe/tagger.

The tagging model was trained using the SequenceTagger class of the Flair framework (Akbik et al., 2019), which implements a BiLSTM-CRF architecture on top of a language embedding, as proposed by Huang et al. (2015).

Hyperparameters

| Parameter | Value |
|---|---|
| Hidden size | 256 |
| Learning rate | 0.1 |
| Mini batch size | 8 |
| Max epochs | 150 |

Results are reported below in comparison to a custom-trained Flair embedding stacked onto a custom-trained fastText model. Both models were trained on the same dataset.

| Category | BERT F1 | BERT Precision | BERT Recall | FastText+Flair F1 | FastText+Flair Precision | FastText+Flair Recall | Test data |
|---|---|---|---|---|---|---|---|
| Direct | 0.80 | 0.86 | 0.74 | 0.84 | 0.90 | 0.79 | historical German, fictional & non-fictional |
| Indirect | 0.76 | 0.79 | 0.73 | 0.73 | 0.78 | 0.68 | historical German, fictional & non-fictional |
| Reported | 0.58 | 0.69 | 0.51 | 0.56 | 0.68 | 0.48 | historical German, fictional & non-fictional |
| Free indirect | 0.57 | 0.80 | 0.44 | 0.47 | 0.78 | 0.34 | modern German, fictional |

Intended use

Historical German texts (1840 to 1920)

(The model also showed good performance on modern German fictional texts.)