AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct

Ekgren committed on Nov 16, 2023

Commit 81ca95a · 1 Parent(s): 4530a58

Update README.md

Files changed (1)

README.md +4 -3

@@ -13,8 +13,9 @@ language:
 pipeline_tag: conversational
 ---
 # Model description
-[AI Sweden](https://huggingface.co/AI-Sweden/)
-[GPT-Sw3 126M instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct/) | [GPT-Sw3 356M instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b-instruct/) | [GPT-Sw3 6.7B instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-instruct/) | [GPT-Sw3 20B instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b-instruct/)
+[AI Sweden](https://huggingface.co/AI-Sweden-Models/)
+[GPT-Sw3 126M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m/) | [GPT-Sw3 356M](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m/) | [GPT-Sw3 1.3B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b/) | [GPT-Sw3 6.7B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b/) | [GPT-Sw3 6.7B v2](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2/) | [GPT-Sw3 20B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b/) | [GPT-Sw3 40B](https://huggingface.co/AI-Sweden-Models/gpt-sw3-40b/)
+[GPT-Sw3 126M Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-126m-instruct/) | [GPT-Sw3 356M Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m-instruct/) | [GPT-Sw3 1.3B Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b-instruct/) | [GPT-Sw3 6.7B v2 Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct/) | [GPT-Sw3 20B Instruct](https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b-instruct/)
 GPT-SW3 is a collection of large decoder-only pretrained transformer language models that were developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. GPT-SW3 has been trained on a dataset containing 320B tokens in Swedish, Norwegian, Danish, Icelandic, English, and programming code. The model was pretrained using a causal language modeling (CLM) objective utilizing the NeMo Megatron GPT implementation.
@@ -158,7 +159,7 @@ Following Mitchell et al. (2018), we provide a model card for GPT-SW3.
 - Books
     - Litteraturbanken (https://litteraturbanken.se/)
-    - The Pile: Books S3
+    - The Pile
 - Articles
     - Diva (https://www.diva-portal.org/)