---
language: nl
widget:
  - text: In het jaar 2030 zullen we
  - text: Toen ik gisteren volledig in de ban was van
  - text: >-
      Studenten en leraren van de Bogazici Universiteit in de Turkse stad
      Istanbul
  - text: In Israël was een strenge lockdown
tags:
  - gpt-neo-1.3B
  - gpt-neo
pipeline_tag: text-generation
datasets:
  - yhavinga/mc4_nl_cleaned
---

# GPT Neo 1.3B pre-trained on cleaned Dutch mC4 🇳🇱

NB: Training in progress.
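The checkpoint can be tried out with the `transformers` text-generation pipeline. A minimal sketch, assuming the model is published under the repository id `yhavinga/gpt-neo-1.3B-dutch` (the repository this card belongs to):

```python
from transformers import pipeline

# Load model and tokenizer from the Hugging Face Hub.
# Repository id assumed from this model card: yhavinga/gpt-neo-1.3B-dutch
generator = pipeline("text-generation", model="yhavinga/gpt-neo-1.3B-dutch")

# One of the widget prompts from the metadata above
# ("In the year 2030 we will").
output = generator(
    "In het jaar 2030 zullen we",
    max_length=50,
    do_sample=True,
    top_k=50,
)
print(output[0]["generated_text"])
```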

## Dataset

* [mC4 NL Cleaned](https://huggingface.co/datasets/yhavinga/mc4_nl_cleaned) (see the loading sketch after this list)
* dataset config: `tiny` (3B tokens)
* dataset config: `large` (24B tokens)
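Both configs can be loaded with the `datasets` library. A minimal sketch, assuming the config names `tiny` and `large` match the dataset repository exactly; streaming avoids downloading the full corpus:

```python
from datasets import load_dataset

# Stream the 3B-token "tiny" config so nothing is fully downloaded.
# Config names ("tiny", "large") are taken from this card and assumed
# to match the dataset repository.
dataset = load_dataset(
    "yhavinga/mc4_nl_cleaned", "tiny", split="train", streaming=True
)

# Peek at the first few documents.
for example in dataset.take(3):
    print(example["text"][:100])
```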

## Tokenizer

* Tokenizer trained on mC4 with the scripts from the Hugging Face Transformers Flax examples (see the sketch after this list)
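The Flax causal language modeling example trains a byte-level BPE tokenizer with the `tokenizers` library along these lines. A sketch under assumptions: the vocabulary size of 50257 (GPT-Neo's usual value) and the use of the `tiny` config are not stated on this card.

```python
from datasets import load_dataset
from tokenizers import ByteLevelBPETokenizer

# Train on the Dutch mC4 corpus ("tiny" config assumed for illustration).
dataset = load_dataset("yhavinga/mc4_nl_cleaned", "tiny", split="train")

tokenizer = ByteLevelBPETokenizer()

def batch_iterator(batch_size=1000):
    # Yield batches of raw text so the trainer can stream over the corpus.
    for i in range(0, len(dataset), batch_size):
        yield dataset[i : i + batch_size]["text"]

tokenizer.train_from_iterator(
    batch_iterator(),
    vocab_size=50257,  # assumed: the default GPT-Neo vocabulary size
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)
tokenizer.save("tokenizer.json")
```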

## Training details

* Trained for 70K steps (batch size 64) to a perplexity of 27 on the `tiny` config of mC4 NL Cleaned (1 epoch)
* Trained for 940K steps (batch size 16) to a perplexity of 16.1 on the full dataset
* Training is still in progress
* Block size: 512
* Optimizer: Adafactor (see the schedule sketch after this list)
* Learning rate: 5e-5
* Warmup steps: 5000
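In optax, which the Flax training scripts build on, the optimizer and warmup above could be expressed roughly as follows. A sketch under assumptions: the card gives only the peak learning rate and warmup steps, so the post-warmup schedule (held constant here) is a guess.

```python
import optax

WARMUP_STEPS = 5000
PEAK_LR = 5e-5

# Linear warmup from 0 to 5e-5 over the first 5000 steps, then constant.
# The card does not state a decay schedule, so none is applied here.
schedule = optax.join_schedules(
    schedules=[
        optax.linear_schedule(
            init_value=0.0, end_value=PEAK_LR, transition_steps=WARMUP_STEPS
        ),
        optax.constant_schedule(PEAK_LR),
    ],
    boundaries=[WARMUP_STEPS],
)

optimizer = optax.adafactor(learning_rate=schedule)
```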

Work in progress, January 2022.