Ekgren commited on
Commit
e08da52
1 Parent(s): b3ba0a4

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -2
README.md CHANGED
@@ -8,8 +8,9 @@ language:
8
  - is
9
  ---
10
  # Model description
11
- [AI Sweden](https://huggingface.co/AI-Sweden/)
12
  [GPT-Sw3 126M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m/) | [GPT-Sw3 356M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b/) | [GPT-Sw3 6.7B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b/) | [GPT-Sw3 20B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b/) | [GPT-Sw3 40B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-40b/)
 
13
 
14
  GPT-SW3 is a collection of large decoder-only pretrained transformer language models that were developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. GPT-SW3 has been trained on a dataset containing 320B tokens in Swedish, Norwegian, Danish, Icelandic, English, and programming code. The model was pretrained using a causal language modeling (CLM) objective utilizing the NeMo Megatron GPT implementation.
15
 
@@ -103,7 +104,7 @@ Following Mitchell et al. (2018), we provide a model card for GPT-SW3.
103
 
104
  - Books
105
  - Litteraturbanken (https://litteraturbanken.se/)
106
- - The Pile: Books S3
107
 
108
  - Articles
109
  - Diva (https://www.diva-portal.org/)
 
8
  - is
9
  ---
10
  # Model description
11
+ [AI Sweden](https://huggingface.co/AI-Sweden-Models/)
12
  [GPT-Sw3 126M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m/) | [GPT-Sw3 356M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b/) | [GPT-Sw3 6.7B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b/) | [GPT-Sw3 20B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b/) | [GPT-Sw3 40B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-40b/)
13
+ [GPT-Sw3 126M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct/) | [GPT-Sw3 356M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m-instruct/) | [GPT-Sw3 1.3B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b-instruct/) | [GPT-Sw3 6.7B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct/) | [GPT-Sw3 20B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b-instruct/)
14
 
15
  GPT-SW3 is a collection of large decoder-only pretrained transformer language models that were developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. GPT-SW3 has been trained on a dataset containing 320B tokens in Swedish, Norwegian, Danish, Icelandic, English, and programming code. The model was pretrained using a causal language modeling (CLM) objective utilizing the NeMo Megatron GPT implementation.
16
 
 
104
 
105
  - Books
106
  - Litteraturbanken (https://litteraturbanken.se/)
107
+ - The Pile
108
 
109
  - Articles
110
  - Diva (https://www.diva-portal.org/)