license: mit
language:
- ru
metrics:
- perplexity
pipeline_tag: text-generation
This model was created by [ilnikolaev](https://huggingface.co/ilnikolaev).
It was trained from scratch using TensorFlow Keras on the [200mb Russian Comments from 2ch](https://www.kaggle.com/datasets/fizzzgen/65mb-of-dvach-conversations) dataset.
Architecture:
- Type: decoder-only
- Tokenizer: BPE
- Vocabulary size: 32000
- Max sequence length: 120
- Hidden size: 768
- FFN size: 3072
- Attention heads: 24
- Decoder layers: 4
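
The hyperparameters listed above can be collected into a small config object to check how they fit together. The following is a minimal sketch, not the author's training code: the names `DecoderConfig` and `rough_weight_count` are hypothetical, and the weight estimate assumes a standard Transformer decoder layout (four attention projections and a two-layer FFN per block, input and output embeddings tied, biases, layer norms and positional embeddings ignored), which this card does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderConfig:
    """Hyperparameters copied from the list above (class name is hypothetical)."""
    vocab_size: int = 32000
    max_seq_len: int = 120
    hidden_size: int = 768
    ffn_size: int = 3072
    num_heads: int = 24
    num_layers: int = 4

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits the hidden size evenly across heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


def rough_weight_count(cfg: DecoderConfig) -> int:
    """Rough weight count under the assumptions stated in the lead-in."""
    embedding = cfg.vocab_size * cfg.hidden_size   # token embedding, assumed tied with the output layer
    attention = 4 * cfg.hidden_size ** 2           # Q, K, V and output projections
    ffn = 2 * cfg.hidden_size * cfg.ffn_size       # up- and down-projection
    return embedding + cfg.num_layers * (attention + ffn)


config = DecoderConfig()
print("head dim:", config.head_dim)
print("approx. weights:", rough_weight_count(config))
```

With 24 heads over a hidden size of 768, each head works on 32 dimensions. The weight count is only an order-of-magnitude estimate; the real model may differ if it uses untied embeddings, biases, or a different block layout.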