
GPT2-Medium pre-trained on cleaned Dutch mC4 🇳🇱

  • Tokenizer trained on mC4 with scripts from the Hugging Face Transformers Flax examples

Training details:

  • Trained for 320k steps (30 Dec 2021)
  • Block size: 512
  • Optimizer: Adam, learning rate 8e-4, beta1 0.9, beta2 0.98
  • Warmup steps: 5000
  • Weight decay: 0.01
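The warmup and peak learning rate above can be sketched as a simple schedule. This is a minimal illustration, not the training script: the card states the peak LR (8e-4) and warmup length (5000 steps), while the shape of the decay after warmup (assumed here to be linear to zero over the 320k total steps) is an assumption.

```python
# Sketch of the learning-rate schedule implied by the card's hyperparameters.
# Stated in the card: peak LR 8e-4, 5000 warmup steps, 320k total steps.
# Assumed: linear warmup from 0, then linear decay to 0 (not confirmed by the card).
PEAK_LR = 8e-4
WARMUP_STEPS = 5000
TOTAL_STEPS = 320_000


def learning_rate(step: int) -> float:
    """Learning rate at a given training step under the assumed schedule."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from the peak to zero over the remaining steps (assumption).
    remaining = TOTAL_STEPS - step
    return max(0.0, PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS))
```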

Further fine-tuned on a Dutch book corpus.

Work in progress, Dec 2021 – Jan 2022.
