---
language: nl
widget:
- text: "Een zalig kerstfeest en "
- text: "Na een lange reeks vertragingen zal eind volgende week de James Webb Space Telescope (JWST) de aarde verlaten. Met een vergulde spiegel van "
tags:
- gpt2-medium
- gpt2
pipeline_tag: text-generation
datasets:
- yhavinga/mc4_nl_cleaned
---
# GPT2-Medium pre-trained on cleaned Dutch mC4 🇳🇱

Training details:

* Trained for 120k steps (24 Dec 2021)
* Block size: 512
* Optimizer: Adam, lr 8e-4, beta1 0.9, beta2 0.98
* Warmup: 5000 steps
* Weight decay: 0.01
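
As a rough illustration of how the warmup and peak learning rate listed above interact, here is a minimal, library-free sketch. The linear warmup shape and the linear decay to zero at step 120k are assumptions; the card only states the peak learning rate, the warmup length, and the step count, so the actual training script may have used a different schedule. The names `PEAK_LR`, `WARMUP_STEPS`, `TOTAL_STEPS`, and `learning_rate` are hypothetical.

```python
# Values taken from the training details above.
PEAK_LR = 8e-4
WARMUP_STEPS = 5_000
TOTAL_STEPS = 120_000


def learning_rate(step: int) -> float:
    """Return the learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to PEAK_LR, followed by linear
    decay back to 0 at TOTAL_STEPS. The decay shape is not stated in
    the model card and is chosen here only for illustration.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(TOTAL_STEPS - step, 0)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 2_500, 5_000, 62_500, 120_000):
        print(f"step {step:>7}: lr = {learning_rate(step):.2e}")
```

Under these assumptions the learning rate reaches its peak of 8e-4 exactly at step 5000 and is back at half the peak midway through the decay phase.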

Work in progress. Dec 2021.

* Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
* Thanks to @gsarti for creating the [t5-flax-gcp
  repository](https://github.com/gsarti/t5-flax-gcp).
* Also thanks to the creators of [gpt2-medium-persian](https://huggingface.co/flax-community/gpt2-medium-persian) and
  [gpt2-medium-indonesian](https://huggingface.co/flax-community/gpt2-medium-indonesian)
  for sharing their training scripts!