metadata
language:
- ru
license: apache-2.0
pipeline_tag: text-generation
BulgakovLM 3B
A language model trained on Russian. May be suitable for further tuning. The 100 gigabyte dataset consisted primarily of web pages, books, poems, and prose. The model was trained over 2 epochs.
Uses GPT-J architecture with a context window of 4k tokens.
Trained thanks to a TRC grant on TPU-VM v3-8