
This model is a fine-tuned version of Leonard Konle's fiction-gbert.

It was fine-tuned for ten epochs on the Deutsches Roman Corpus (DROC) for literary character detection, using a standard token-classification head. Unlike most named-entity models, it detects not only proper names but also common-noun references to a character (matching both "Harry" and "Zauberer").

The model achieves F1 scores of 92.12 % and 89.98 % on the semi-official DROC validation and test sets, respectively.

The code to reproduce the dataset and the training run is available on GitHub.

Additional hyperparameters are:

  • Epochs: 10
  • Batch size: 8
  • Optimizer: AdamW
  • Learning rate: 2e-05
  • Weight decay: 0.1
  • Scheduler: linear warmup over the first 10 % of training, followed by linear decay for the remainder
  • Precision: 32-bit
  • Training framework: Trident
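The warmup/decay schedule above can be sketched as a plain function of the training step; this is a minimal illustration of the stated schedule (the function name and step accounting are my own, not from the training code):

```python
def lr_at_step(step, total_steps, base_lr=2e-5, warmup_frac=0.1):
    """Linear warmup over the first 10% of steps, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr back to 0 over the remaining steps.
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, with 1000 total steps the learning rate peaks at 2e-05 at step 100 and reaches 0 at step 1000.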

ID2Label map:

```
{
  0: "O",
  1: "B-PER",
  2: "I-PER"
}
```
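The label map follows the usual BIO scheme, so predicted label IDs can be grouped into character mentions. A minimal sketch of that decoding step (the helper and the example tag sequence are illustrative, not taken from the repository):

```python
ID2LABEL = {0: "O", 1: "B-PER", 2: "I-PER"}

def decode_spans(tokens, label_ids):
    """Group B-PER/I-PER tagged tokens into character-mention strings."""
    spans, current = [], []
    for tok, lid in zip(tokens, label_ids):
        label = ID2LABEL[lid]
        if label == "B-PER":
            if current:
                spans.append(" ".join(current))
            current = [tok]           # start a new mention
        elif label == "I-PER" and current:
            current.append(tok)       # continue the open mention
        else:
            if current:
                spans.append(" ".join(current))
                current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = ["Harry", "sah", "den", "alten", "Zauberer", "an", "."]
labels = [1, 0, 0, 1, 2, 0, 0]
print(decode_spans(tokens, labels))  # → ['Harry', 'alten Zauberer']
```

Note that both the proper name "Harry" and the noun phrase "alten Zauberer" come out as character mentions, matching the model's broader notion of character reference described above.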