tartuNLP/gpt-for-est-large

	---
	tags:
	- generated_from_trainer
	model-index:
	- name: gpt-est-large
	  results: []

	widget:
	- text: ">wiki< mis on GPT? Vastus:"

	---

	# gpt-est-large

	This is a large-size [GPT-2](https://huggingface.co/docs/transformers/model_doc/gpt2) model trained from scratch on 2.2 billion words (Estonian National Corpus, News Crawl, and Common Crawl). It was previously named "gpt-4-est-large" and was renamed to avoid click-baiting.

	[Reference](https://doi.org/10.22364/bjmc.2022.10.3.19)

	### Format

	During training, each text was prepended with a domain tag, and the same tag should be added as a prefix when using the model: >general<, >web<, >news<, >doaj< and >wiki< (standing for general texts, web-crawled texts, news, article abstracts and Wikipedia texts). Place the tag at the start of the prompt, followed by a space, for example: ">web< Kas tead, et".
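
The prefix convention can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the helper names `build_prompt` and `generate_text` are hypothetical, the repository id `tartuNLP/gpt-for-est-large` is assumed from the Hub page for this card, and the generation call assumes a reasonably recent `transformers` release that accepts `max_new_tokens`.

```python
# Domain tags listed in the model card; the helper names below are hypothetical.
DOMAIN_TAGS = ("general", "web", "news", "doaj", "wiki")


def build_prompt(domain: str, text: str) -> str:
    """Prepend the domain tag in the '>tag< text' form shown in the card."""
    if domain not in DOMAIN_TAGS:
        raise ValueError(f"unknown domain tag: {domain!r}")
    return f">{domain}< {text}"


def generate_text(domain: str, text: str, max_new_tokens: int = 30) -> str:
    """Generate a continuation; downloads the model on first use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import pipeline

    # Assumption: the repository id matches the Hub page for this card.
    generator = pipeline("text-generation", model="tartuNLP/gpt-for-est-large")
    outputs = generator(build_prompt(domain, text), max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]
```

Calling `generate_text("web", "Kas tead, et")` would send the prompt ">web< Kas tead, et" to the model. Validating the tag up front is a design choice of this sketch; the model itself accepts any text.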

	### Model details

	- num. of layers: 24
	- num. of heads: 24
	- embedding size: 1536
	- context size: 1024
	- total size: 723.58M params
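
As a rough sanity check on these figures, the sketch below counts parameters for a standard GPT-2 layout with tied input and output embeddings. The function `gpt2_param_count` is a hypothetical helper, and the card does not state the vocabulary size, so only the part that excludes token embeddings is computed. Under these assumptions that part comes to about 681.5M, which would leave roughly 42M of the reported total for the token embedding matrix.

```python
def gpt2_param_count(n_layer: int, n_embd: int, n_positions: int, vocab_size: int) -> int:
    """Count parameters of a standard GPT-2 with tied input/output embeddings."""
    # Per block: attention (4*d^2) + MLP (8*d^2) weights,
    # plus 13*d bias and LayerNorm terms.
    per_block = 12 * n_embd**2 + 13 * n_embd
    token_emb = vocab_size * n_embd
    position_emb = n_positions * n_embd
    final_layer_norm = 2 * n_embd
    return n_layer * per_block + token_emb + position_emb + final_layer_norm


# Values from the list above; the vocabulary size is not stated in the card,
# so it is set to 0 to count everything except the token embeddings.
without_token_emb = gpt2_param_count(n_layer=24, n_embd=1536, n_positions=1024, vocab_size=0)
head_dim = 1536 // 24  # embedding size split evenly across 24 heads
print(without_token_emb, head_dim)  # prints: 681532416 64
```

The same formula reproduces the well-known 124,439,808 parameters of the original small GPT-2 (12 layers, 768 dimensions, 1024 positions, 50,257 tokens), which suggests the counting convention is reasonable, though it may not match how the figure in this card was obtained.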

	Further details to be added soon.

	### Framework versions

	- Transformers 4.13.0.dev0
	- PyTorch 1.10.0+cu102
	- Datasets 1.15.1
	- Tokenizers 0.10.3