Model Card for GPT2-Lithuanian

A GPT-2 based model trained for Lithuanian.

Model Description

The model architecture is copied from the ai-forever/mGPT model; however, it is trained from scratch on a modified subset of the Lithuanian partition of the mC4 dataset.

Training was performed on the Vilnius University supercomputer.
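
The snippet below is a minimal usage sketch, assuming the model is published as domce20/GPT2-Lithuanian on the Hugging Face Hub and loads with the standard transformers auto classes; the prompt is only an illustrative Lithuanian example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "domce20/GPT2-Lithuanian"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation for a Lithuanian prompt ("Vilnius University is ...").
prompt = "Vilniaus universitetas yra"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```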

Model size: 1.42B parameters (F32 tensors, Safetensors format).
