
Easy German GPT2 Model

A language model for easy German ("leichte Sprache"), based on the German GPT-2 model.

Model Details

Initialized using the weights of the German GPT-2 model,
then fine-tuned for one epoch on "leichte Sprache" corpora consisting of:

  • encyclopedia-like data
  • news-like data

Hyperparameters used for fine-tuning (see the training sketch below):

  • tokenizer:

    • max_length: 1024 (but trained with dynamic length, using the data collator's pad_to_multiple_of=8)
    • stride: 64
    • return_overflowing_tokens=True
  • training arguments:

    • num_train_epochs=1
    • learning_rate=1e-3
    • weight_decay=0.01
    • per_device_train_batch_size=4
    • gradient_accumulation_steps=4
    • warmup_steps=200
    • fp16=True

→ 25,112 training items; trained on a Google Colab GPU (about 30 minutes)
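
A minimal sketch of this setup with the Hugging Face transformers Trainer, using the hyperparameters listed above. The base checkpoint name (dbmdz/german-gpt2), the output directory, and the dataset handling are assumptions; the card does not specify them.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "dbmdz/german-gpt2"  # assumption: the card does not name the exact base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token by default
model = AutoModelForCausalLM.from_pretrained(base)

def tokenize(batch):
    # Split long documents into overlapping 1024-token windows (stride 64),
    # keeping the overflow windows as extra training items.
    return tokenizer(
        batch["text"],
        truncation=True,
        max_length=1024,
        stride=64,
        return_overflowing_tokens=True,
    )

# `corpus` stands in for the "leichte Sprache" dataset (not published with the card):
# tokenized = corpus.map(tokenize, batched=True, remove_columns=corpus.column_names)

# Dynamic padding to a multiple of 8 keeps fp16 tensor shapes efficient.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=False, pad_to_multiple_of=8
)

args = TrainingArguments(
    output_dir="easy-german-gpt2",  # assumption
    num_train_epochs=1,
    learning_rate=1e-3,
    weight_decay=0.01,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    warmup_steps=200,
    fp16=True,
)

# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=tokenized)
# trainer.train()
```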

Evaluation results

The perplexity values are calculated on an unseen dataset containing manually aligned standard German and "leichte Sprache" texts.
For the calculation, the method described in this tutorial was used with the following values:

  • max_length = 512
  • stride = 256

For comparison: running the modified function on this example yields a perplexity score of 18.2551.
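
A sketch of the sliding-window perplexity computation described in the tutorial, with the values above (max_length=512, stride=256). Here `model`, `tokenizer`, and `text` are assumed to be the model under evaluation, its tokenizer, and the evaluation text.

```python
import torch

def perplexity(model, tokenizer, text, max_length=512, stride=256):
    # Sliding-window evaluation: each window scores only the tokens not
    # already scored by the previous window; labels of the overlapping
    # context are set to -100 so the loss ignores them.
    encodings = tokenizer(text, return_tensors="pt")
    seq_len = encodings.input_ids.size(1)

    nlls = []
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        trg_len = end - prev_end  # number of newly scored tokens
        input_ids = encodings.input_ids[:, begin:end].to(model.device)
        target_ids = input_ids.clone()
        target_ids[:, :-trg_len] = -100

        with torch.no_grad():
            out = model(input_ids, labels=target_ids)
            # out.loss is the mean NLL over scored tokens; re-weight it so
            # windows that score different numbers of tokens contribute
            # proportionally to the total.
            nlls.append(out.loss * trg_len)

        prev_end = end
        if end == seq_len:
            break

    return torch.exp(torch.stack(nlls).sum() / seq_len)
```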

| Model | Perplexity "leichte Sprache" (Easy MDR News) | Perplexity standard German (Standard MDR News) |
| --- | --- | --- |
| German GPT-2 model | 23.8257 | 24.0301 |
| our model | 17.3053 | 48.6314 |