Update README.md
README.md
CHANGED
@@ -3,5 +3,27 @@ language:
 - fi
 pipeline_tag: text-generation
 ---
-GPT-3
+Generative Pretrained Transformer with 8B parameters for Finnish.
 
+TurkuNLP Finnish GPT-3 models are a family of pretrained monolingual GPT-style language models based on the BLOOM architecture.
+Note that these are pure language models: they have not been [instruction finetuned](https://arxiv.org/abs/2203.02155) for dialogue
+or question answering.
+
+These models are intended as foundation models that can, for example, be instruction finetuned to serve as modern chat models.
+
+
+
+**Parameters**
+| Model  | Layers | Dim  | Heads | Params |
+|--------|--------|------|-------|--------|
+| Small  | 12     | 768  | 12    | 186M   |
+| Medium | 24     | 1024 | 16    | 437M   |
+| Large  | 24     | 1536 | 16    | 881M   |
+| XL     | 24     | 2064 | 24    | 1.5B   |
+| "3B"   | 32     | 2560 | 32    | 2.8B   |
+| "8B"   | 32     | 4096 | 32    | 7.5B   |
+| "13B"  | 40     | 5120 | 40    | 13.3B  |
+
+
+
+More documentation coming soon!
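
As a reading aid for the Parameters table in this change, the sketch below estimates a parameter count from the Layers and Dim columns using the common rule of thumb for decoder-only transformers: roughly 12 × layers × dim² for the transformer blocks, plus vocab × dim for the token embeddings. The vocabulary size `ASSUMED_VOCAB_SIZE` is an assumption made for illustration and is not stated in the README; the function and variable names are hypothetical. The estimate ignores biases and layer norms, so it will not match every row exactly.

```python
# Rough parameter-count estimate for a decoder-only transformer.
# ASSUMPTION: the vocabulary size below is illustrative; the README does not state it.
ASSUMED_VOCAB_SIZE = 131_072

# (layers, hidden dim) pairs copied from the Parameters table in the README.
MODELS = {
    "Small": (12, 768),
    "Medium": (24, 1024),
    "Large": (24, 1536),
    "XL": (24, 2064),
    "3B": (32, 2560),
    "8B": (32, 4096),
    "13B": (40, 5120),
}


def estimate_params(layers: int, dim: int, vocab_size: int = ASSUMED_VOCAB_SIZE) -> int:
    """Approximate the parameter count, ignoring biases and layer norms."""
    # Per layer: about 4*dim^2 for attention projections and
    # about 8*dim^2 for an MLP with 4x expansion, i.e. 12*dim^2 in total.
    block_params = 12 * layers * dim * dim
    # Token embedding matrix (assumed tied with the output projection).
    embedding_params = vocab_size * dim
    return block_params + embedding_params


if __name__ == "__main__":
    for name, (layers, dim) in MODELS.items():
        print(f"{name:>6}: ~{estimate_params(layers, dim) / 1e9:.2f}B parameters")
```

Comparing the printed estimates against the Params column is a quick sanity check on the table; a noticeable gap for a row suggests either a different vocabulary size or architectural details that the rule of thumb does not capture.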