# What is this?

A GPT-2 model (small version, 124 M parameters) for Danish text generation. The model was not pre-trained from scratch but adapted from the English version.

# How to use

Test the model using the pipeline from the [🤗 Transformers](https://github.com/huggingface/transformers) library:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="KennethTM/gpt2-small-danish")
text = generator("Manden arbejdede som")

print(text[0]["generated_text"])
```
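
The pipeline forwards the usual generation keyword arguments if more control is needed. A minimal sketch continuing from the block above (the values are illustrative assumptions, not settings from the model card):

```python
# Sample three continuations of up to 30 new tokens (illustrative values)
outputs = generator(
    "Manden arbejdede som",
    max_new_tokens=30,
    do_sample=True,
    num_return_sequences=3,
)

for output in outputs:
    print(output["generated_text"])
```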

Or load it using the Auto* classes:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("KennethTM/gpt2-small-danish")
model = AutoModelForCausalLM.from_pretrained("KennethTM/gpt2-small-danish")
```
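
With the model loaded this way, text can also be generated by calling `generate` directly. A minimal sketch continuing from the block above (sampling settings are illustrative assumptions):

```python
inputs = tokenizer("Manden arbejdede som", return_tensors="pt")

# Generate up to 30 new tokens by sampling (illustrative settings)
output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=True)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```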

# Model training

The model is trained using the Danish part of the [oscar dataset](https://huggingface.co/datasets/oscar) ('unshuffled_deduplicated_da') and a context length of 1024 tokens.

The model is initialized from the English [GPT-2 small model](https://huggingface.co/gpt2) with new word token embeddings created for Danish using [WECHSEL](https://github.com/CPJKU/wechsel).
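
A rough sketch of this initialization, following the usage shown in the WECHSEL repository (the tokenizer training step, the fastText embeddings, and the dictionary name are assumptions based on that README, not details confirmed by this model card):

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from wechsel import WECHSEL, load_embeddings

source_tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Train a Danish tokenizer with the same vocabulary size as the English one
dataset = load_dataset("oscar", "unshuffled_deduplicated_da", split="train")
target_tokenizer = source_tokenizer.train_new_from_iterator(
    dataset["text"], vocab_size=len(source_tokenizer)
)

# Map the English embeddings to the new Danish vocabulary via aligned
# fastText embeddings and a bilingual dictionary (assumed names)
wechsel = WECHSEL(
    load_embeddings("en"),
    load_embeddings("da"),
    bilingual_dictionary="danish",
)
target_embeddings, info = wechsel.apply(
    source_tokenizer,
    target_tokenizer,
    model.get_input_embeddings().weight.detach().numpy(),
)
model.get_input_embeddings().weight.data = torch.from_numpy(target_embeddings)
```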

Initially, only the word token embeddings are trained using 50,000 samples. Finally, the whole model is trained using 1,000,000 samples.
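
One way to implement this two-stage schedule (a sketch, not the author's exact training code) is to freeze everything except the embeddings for the first pass. Note that GPT-2 ties the input embeddings to the output head, so the head is updated along with them:

```python
# Stage 1: train only the newly initialized word token embeddings
for param in model.parameters():
    param.requires_grad = False
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True

# ... first training pass over 50,000 samples ...

# Stage 2: unfreeze and train the whole model
for param in model.parameters():
    param.requires_grad = True

# ... second training pass over 1,000,000 samples ...
```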

Model training is carried out on an 8 GB GPU.
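
Fitting a 124 M parameter model with a 1024-token context on 8 GB typically calls for memory-saving settings. A sketch using 🤗 Trainer arguments (all values are illustrative assumptions, not the configuration used for this model):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-small-danish",
    per_device_train_batch_size=2,   # small batches to fit in 8 GB
    gradient_accumulation_steps=16,  # recover a larger effective batch size
    fp16=True,                       # half precision reduces memory use
    gradient_checkpointing=True,     # trade compute for activation memory
)
```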

# Notes