gpt-est-base
This is the base-size GPT2 model, trained from scratch on 2.2 billion words (Estonian National Corpus + News Crawl + Common Crawl) for 3 epochs. Previously named "gpt-4-est-base", renamed to avoid click-baiting.
Format
For training data was prepended with a text domain tag, and it should be added as prefix when using the model: >general<, >web<, >news<, >doaj< and >wiki< (standing for general texts, web crawled texts, news, article abstracts and wikipedia texts). Use the prefixes like this, e.g: ">web< Kas tead, et".
Model details
- num. of layers: 12
- num. of heads: 12
- embedding size: 768
- context size: 1024
- total size: 118.68M params
Further details to be added soon.
Framework versions
- Transformers 4.13.0.dev0
- Pytorch 1.10.0+cu102
- Datasets 1.15.1
- Tokenizers 0.10.3
- Downloads last month
- 137
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.