---
language: nl
widget:
- text: "Een zalig kerstfeest en "
- text: "Na een lange reeks vertragingen zal eind volgende week de James Webb Space Telescope (JWST) de aarde verlaten. Met een vergulde spiegel van "
tags:
- adaption
- recycled
- gpt2-medium
- gpt2
pipeline_tag: text-generation
datasets:
- yhavinga/mc4_nl_cleaned
---
|
# GPT2-Medium pre-trained on cleaned Dutch mC4 🇳🇱
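
A minimal generation sketch with the 🤗 Transformers `pipeline` API, using the first widget prompt above. The model id below is an assumption (this card does not state its repository id); substitute the actual id of this repo.

```python
from transformers import pipeline

# Minimal text-generation sketch. The model id is an assumption
# (this card does not state its repo id); replace it with the
# actual id of this repository.
generator = pipeline("text-generation", model="yhavinga/gpt2-medium-dutch")

# Prompt taken from the first widget example above.
result = generator("Een zalig kerstfeest en ", max_length=50, do_sample=True)
print(result[0]["generated_text"])
```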
|

Training details:
|

* trained for 120k steps (24 Dec 2021)
* block size: 512
* optimizer: Adam, lr 8e-4, beta1 0.9, beta2 0.98 (sketched below)
* warmup: 5000 steps
* weight decay: 0.01
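
These optimizer settings can be written down with optax, the JAX/Flax optimizer library used by training scripts like t5-flax-gcp. A minimal sketch under stated assumptions: the card gives the peak lr, betas, warmup steps, and weight decay, but not the decay shape after warmup, so the cosine decay to zero over the 120k steps and the AdamW-style decoupled weight decay are assumptions.

```python
import optax

# Sketch of the training optimizer from the hyperparameters above.
# Assumed: cosine decay after warmup and AdamW-style decoupled weight
# decay; the card only states lr, betas, warmup steps, and decay value.
schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,       # ramp up from zero during warmup
    peak_value=8e-4,      # lr 8e-4 (from this card)
    warmup_steps=5_000,   # warmup 5000 steps (from this card)
    decay_steps=120_000,  # 120k training steps (from this card)
)
optimizer = optax.adamw(
    learning_rate=schedule,
    b1=0.9,               # beta1 (from this card)
    b2=0.98,              # beta2 (from this card)
    weight_decay=0.01,    # weight decay (from this card)
)
```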
|

Work in progress. Dec 2021.
|

* Many thanks to the [Google TPU Research Cloud](https://sites.research.google/trc/about/) for providing access to a TPU cluster!
* Thanks to @gsarti for creating the [t5-flax-gcp repository](https://github.com/gsarti/t5-flax-gcp).
|
* Also thanks to the creators of [gpt2-medium-persian](https://huggingface.co/flax-community/gpt2-medium-persian) and [gpt2-medium-indonesian](https://huggingface.co/flax-community/gpt2-medium-indonesian) for sharing their training scripts!