
GPT2-Medium pre-trained on cleaned Dutch mC4 🇳🇱

  • Tokenizer trained on mC4 with scripts from the Hugging Face Transformers Flax examples

Training details:

  • Trained for 320k steps (30 Dec 2021)
  • Block size: 512
  • Optimizer: Adam, learning rate 8e-4, beta1 0.9, beta2 0.98
  • Warmup steps: 5000
  • Weight decay: 0.01
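The warmup and peak learning rate above can be sketched as a simple schedule. This is a minimal illustration, not the training script: the card states the peak LR (8e-4) and warmup length (5000 steps), while the shape of the decay after warmup (assumed here to be linear to zero over the 320k total steps) is an assumption.

```python
# Sketch of the learning-rate schedule implied by the card's hyperparameters.
# Stated in the card: peak LR 8e-4, 5000 warmup steps, 320k total steps.
# Assumed: linear warmup from 0, then linear decay to 0 (not confirmed by the card).
PEAK_LR = 8e-4
WARMUP_STEPS = 5000
TOTAL_STEPS = 320_000


def learning_rate(step: int) -> float:
    """Learning rate at a given training step under the assumed schedule."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak to zero over the remaining steps (assumption).
    remaining = TOTAL_STEPS - step
    return max(0.0, PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS))
```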

Further fine-tuned on a Dutch book corpus.

Work in progress, Dec 2021 – Jan 2022.
