Update README.md
README.md
CHANGED
@@ -8,8 +8,9 @@ language:
 - is
 ---
 # Model description
-[AI Sweden](https://huggingface.co/AI-Sweden/)
-[GPT-Sw3 126M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m/) | [GPT-Sw3 356M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b/) | [GPT-Sw3 6.7B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b/) | [GPT-Sw3 20B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b/) | [GPT-Sw3 40B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-40b/)
+[AI Sweden](https://huggingface.co/AI-Sweden-Models/)
+[GPT-Sw3 126M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m/) | [GPT-Sw3 356M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b/) | [GPT-Sw3 6.7B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b/) | [GPT-Sw3 6.7B v2](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2/) | [GPT-Sw3 20B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b/) | [GPT-Sw3 40B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-40b/)
+[GPT-Sw3 126M Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct/) | [GPT-Sw3 356M Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m-instruct/) | [GPT-Sw3 1.3B Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b-instruct/) | [GPT-Sw3 6.7B v2 Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct/) | [GPT-Sw3 20B Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b-instruct/)
 
 GPT-SW3 is a collection of large decoder-only pretrained transformer language models that were developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. GPT-SW3 has been trained on a dataset containing 320B tokens in Swedish, Norwegian, Danish, Icelandic, English, and programming code. The model was pretrained using a causal language modeling (CLM) objective utilizing the NeMo Megatron GPT implementation.
 
@@ -74,7 +75,7 @@ Following Mitchell et al. (2018), we provide a model card for GPT-SW3.
 - Model type: GPT-SW3 is a large decoder-only transformer language model.
 - Information about training algorithms, parameters, fairness constraints or other applied approaches, and features: GPT-SW3 was trained with the NeMo Megatron GPT implementation.
 - Paper or other resource for more information: N/A.
-- License: [
+- License: [LICENSE](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/blob/main/LICENSE).
 - Where to send questions or comments about the model: nlu@ai.se
 
 # Intended Use
@@ -103,7 +104,7 @@ Following Mitchell et al. (2018), we provide a model card for GPT-SW3.
 
 - Books
 - Litteraturbanken (https://litteraturbanken.se/)
-- The Pile
+- The Pile
 
 - Articles
 - Diva (https://www.diva-portal.org/)