---
license: mit
language:
- ru
metrics:
- perplexity
pipeline_tag: text-generation
---
dvachGPT was created by [ilnikolaev](https://huggingface.co/ilnikolaev). It was trained from scratch with TensorFlow Keras on the [200mb Russian Comments from 2ch](https://www.kaggle.com/datasets/fizzzgen/65mb-of-dvach-conversations) dataset.

Architecture:

- Type: decoder-only
- Tokenizer: BPE
- Vocabulary size: 32000
- Max sequence length: 120
- Hidden size: 768
- FFN size: 3072
- Attention heads: 24
- Decoder layers: 4
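
To make the figures above easier to sanity-check, here is a minimal Python sketch that stores them in a config object and derives the per-head dimension and a rough parameter count. The `DecoderConfig` class and both helper functions are hypothetical names, not part of the released model code. The count also assumes a standard GPT-style block (biased Q/K/V/output projections, a two-layer FFN, two layer norms per block), learned positional embeddings, and an output projection tied to the token embeddings. None of these details are stated in this card, so treat the result as an estimate only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderConfig:
    """Hyperparameters copied from the list in this model card."""
    vocab_size: int = 32000
    max_seq_len: int = 120
    hidden_size: int = 768
    ffn_size: int = 3072
    num_heads: int = 24
    num_layers: int = 4

    @property
    def head_dim(self) -> int:
        # Each attention head works on an equal slice of the hidden vector.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


def layer_params(cfg: DecoderConfig) -> int:
    """Approximate parameters in one decoder block (assumed GPT-style layout)."""
    d, f = cfg.hidden_size, cfg.ffn_size
    attention = 4 * (d * d + d)        # Q, K, V and output projections with biases
    ffn = (d * f + f) + (f * d + d)    # two dense layers with biases
    layer_norms = 2 * (2 * d)          # scale and shift for two layer norms
    return attention + ffn + layer_norms


def total_params(cfg: DecoderConfig) -> int:
    """Approximate total, assuming tied input/output embeddings."""
    token_emb = cfg.vocab_size * cfg.hidden_size
    pos_emb = cfg.max_seq_len * cfg.hidden_size   # assumed learned positions
    final_norm = 2 * cfg.hidden_size
    return token_emb + pos_emb + cfg.num_layers * layer_params(cfg) + final_norm


cfg = DecoderConfig()
print("head dim:", cfg.head_dim)
print("params per layer:", layer_params(cfg))
print("approx. total params:", total_params(cfg))
```

Under these assumptions, each of the 24 heads has dimension 768 / 24 = 32, and the token embedding table (32000 × 768) accounts for a large share of the total because the model has only 4 decoder layers.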