---
language:
  - bg
  - cs
  - da
  - de
  - el
  - en
  - es
  - et
  - fi
  - fr
  - ga
  - hr
  - hu
  - it
  - lt
  - lv
  - mt
  - nl
  - pl
  - pt
  - ro
  - sk
  - sl
  - sv
  - uk
  - multilingual
license: mit
---

# EuroGPT2

**Note:** This is the original Megatron-DeepSpeed checkpoint, including optimizer states.

A GPT-2 language model for European languages (EU-24 plus Ukrainian). The model follows the same architecture as OpenAI's GPT-2, except that it uses rotary instead of learned positional embeddings.
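
Rotary embeddings encode token positions by rotating pairs of query/key dimensions through position-dependent angles, so relative offsets appear directly in the attention dot products rather than being learned as vectors. Below is a minimal NumPy sketch of the mechanism, for illustration only (it is not this model's training code; the function name, the interleaved dimension pairing, and the base of 10000 are assumptions following the standard RoFormer formulation):

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, head_dim).

    Pairs of dimensions (2i, 2i+1) are rotated by an angle that grows with
    the token position, so a relative offset between two tokens becomes a
    phase difference in their dot product.
    """
    seq_len, head_dim = x.shape
    # One frequency per dimension pair, decaying geometrically.
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, head_dim // 2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin  # 2D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```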

## Model settings

- parameters: 124M
- number of layers: 12
- hidden size: 768
- number of heads: 12
- sequence length: 1024
- batch size: 168
- test PPL after training: 23.6 (steps: 436,940)
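
The 124M figure is consistent with a GPT-2-small-shaped transformer. A rough back-of-the-envelope count (assuming GPT-2's vocabulary size of 50,257, which this card does not state for the multilingual tokenizer, and omitting bias terms):

```python
# Rough parameter count for a GPT-2-small-shaped model.
vocab_size = 50257  # assumption: GPT-2's vocabulary; the actual size is not stated here
n_layers, d = 12, 768

embeddings = vocab_size * d   # token embeddings only; no learned positions (rotary)
attn = 4 * d * d              # Q, K, V and output projections
mlp = 2 * d * (4 * d)         # up- and down-projections with 4x expansion
norms = 2 * 2 * d             # two LayerNorms per block (scale + bias)
total = embeddings + n_layers * (attn + mlp + norms) + 2 * d  # + final LayerNorm

print(f"~{total / 1e6:.1f}M parameters")  # ~123.6M, i.e. roughly 124M
```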

## Training data

### Languages

Included languages: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Irish, Croatian, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovenian, Swedish, and Ukrainian.

| Language | Ratio |
|----------|-------|
| bg | 5.92% |
| cs | 4.77% |
| da | 2.19% |
| de | 7.36% |
| el | 8.60% |
| en | 10.11% |
| es | 6.57% |
| et | 1.67% |
| fi | 2.70% |
| fr | 7.18% |
| ga | 0.25% |
| hr | 1.09% |
| hu | 6.38% |
| it | 5.80% |
| lt | 2.01% |
| lv | 1.76% |
| mt | 1.49% |
| nl | 5.20% |
| pl | 4.82% |
| pt | 4.64% |
| ro | 2.93% |
| sk | 2.03% |
| sl | 1.54% |
| sv | 3.00% |

## License

MIT