dvachGPT / README.md
metadata
license: mit
language:
  - ru
metrics:
  - perplexity
pipeline_tag: text-generation

This model was created by ilnikolaev

Trained from scratch using TensorFlow Keras

Trained on 200 MB of Russian-language comments from a 2ch dataset

  • Type: decoder-only
  • Tokenizer: BPE
  • Vocabulary size: 32000
  • Max sequence length: 120
  • Hidden size: 768
  • FFN size: 3072
  • Attention heads: 24
  • Decoder layers: 4
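
The hyperparameters above are enough for a rough size estimate. The sketch below is not the author's code: `DecoderConfig` and `estimate_parameters` are hypothetical names, and the count assumes a GPT-2-style layout (learned positional embeddings, biases on every dense layer, two layer norms per block plus a final one, output projection tied to the token embedding). None of these details are stated in this card, so treat the result as an approximation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderConfig:
    """Hyperparameters copied from the list above (hypothetical container)."""
    vocab_size: int = 32000
    max_seq_len: int = 120
    hidden_size: int = 768
    ffn_size: int = 3072
    num_heads: int = 24
    num_layers: int = 4

    @property
    def head_dim(self) -> int:
        # The hidden size must split evenly across attention heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


def estimate_parameters(cfg: DecoderConfig, tied_output: bool = True) -> int:
    """Estimate trainable parameters under the assumptions in the lead-in."""
    h, f = cfg.hidden_size, cfg.ffn_size

    # Assumed: Q, K, V and output projections, each h x h with a bias.
    attention = 4 * (h * h + h)
    # Assumed: two dense layers (h -> f -> h), each with a bias.
    feed_forward = (h * f + f) + (f * h + h)
    # Assumed: two layer norms per block, each with scale and shift.
    layer_norms = 2 * (2 * h)
    per_layer = attention + feed_forward + layer_norms

    token_embedding = cfg.vocab_size * h
    # Assumed: learned (not sinusoidal) positional embeddings.
    position_embedding = cfg.max_seq_len * h
    final_norm = 2 * h

    total = (cfg.num_layers * per_layer + token_embedding
             + position_embedding + final_norm)
    if not tied_output:
        # An untied output projection adds another vocab_size x h matrix.
        total += cfg.vocab_size * h
    return total


cfg = DecoderConfig()
print(f"head dimension: {cfg.head_dim}")
print(f"estimated parameters (tied output): {estimate_parameters(cfg):,}")
```

Under these assumptions each attention head has dimension 32 and the model has roughly 53 million parameters, of which about 24.6 million sit in the token embedding. With only 4 decoder layers, the embedding table accounts for close to half of the total.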