Ekgren committed fd8ea96 (1 parent: 7b820a2)

Update README.md

Files changed (1): README.md (+5 -4)
@@ -13,8 +13,9 @@ language:
  pipeline_tag: conversational
  ---
  # Model description
- [AI Sweden](https://huggingface.co/AI-Sweden/)
- [GPT-Sw3 126M instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct/) | [GPT-Sw3 356M instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b-instruct/) | [GPT-Sw3 6.7B instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-instruct/) | [GPT-Sw3 20B instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b-instruct/)
+ [AI Sweden](https://huggingface.co/AI-Sweden-Models/)
+ [GPT-Sw3 126M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m/) | [GPT-Sw3 356M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b/) | [GPT-Sw3 6.7B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b/) | [GPT-Sw3 6.7B v2](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2/) | [GPT-Sw3 20B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b/) | [GPT-Sw3 40B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-40b/)
+ [GPT-Sw3 126M Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct/) | [GPT-Sw3 356M Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m-instruct/) | [GPT-Sw3 1.3B Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b-instruct/) | [GPT-Sw3 6.7B v2 Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct/) | [GPT-Sw3 20B Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b-instruct/)
 
  GPT-SW3 is a collection of large decoder-only pretrained transformer language models that were developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. GPT-SW3 has been trained on a dataset containing 320B tokens in Swedish, Norwegian, Danish, Icelandic, English, and programming code. The model was pretrained using a causal language modeling (CLM) objective utilizing the NeMo Megatron GPT implementation.
 
@@ -129,7 +130,7 @@ Following Mitchell et al. (2018), we provide a model card for GPT-SW3.
  - Model type: GPT-SW3 is a large decoder-only transformer language model.
  - Information about training algorithms, parameters, fairness constraints or other applied approaches, and features: GPT-SW3 was trained with the NeMo Megatron GPT implementation.
  - Paper or other resource for more information: N/A.
- - License: [GPT-SW3 is made available through the modified RAIL license agreement](https://drive.google.com/file/d/1Ssf4ldah66P0Gvk64OkgzMI3JEqL9Ubk/view).
+ - License: [LICENSE](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct/blob/main/LICENSE).
  - Where to send questions or comments about the model: nlu@ai.se
 
  # Intended Use
@@ -158,7 +159,7 @@ Following Mitchell et al. (2018), we provide a model card for GPT-SW3.
 
  - Books
  - Litteraturbanken (https://litteraturbanken.se/)
- - The Pile: Books S3
+ - The Pile
 
  - Articles
  - Diva (https://www.diva-portal.org/)