Kristijan committed on
Commit
1276734
1 Parent(s): 4b2e4e2

Update README.md


add usage descriptions

Files changed (1)
  1. README.md +22 -0
README.md CHANGED
@@ -29,6 +29,28 @@ paper: [Characterizing Verbatim Short-Term Memory in Neural Language Models](htt
 
  This is a gpt2-small-like decoder-only transformer model trained on a 40M token subset of the [wikitext-103 dataset](https://paperswithcode.com/dataset/wikitext-103).
 
+ # Usage
+
+ You can download and load the model as follows:
+
+ ```python
+ from transformers import GPT2LMHeadModel
+
+ model = GPT2LMHeadModel.from_pretrained("Kristijan/gpt2_wt103-40m_12-layer")
+ ```
+
+ Alternatively, if you have downloaded the checkpoint files from this repository, you can also load the model from the local folder:
+
+ ```python
+ from transformers import GPT2LMHeadModel
+
+ model = GPT2LMHeadModel.from_pretrained(path_to_folder_with_checkpoint_files)
+ ```
+
+ To tokenize text for this model, use the [tokenizer trained on WikiText-103](https://huggingface.co/Kristijan/wikitext-103-tokenizer).
+
  # Intended uses
 
  This checkpoint is intended for research purposes, for example, for those interested in studying the behavior of transformer language models trained on smaller datasets.
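For completeness, below is a minimal end-to-end sketch of tokenizing and scoring a short passage with this checkpoint. It assumes the linked WikiText-103 tokenizer can be loaded with `AutoTokenizer.from_pretrained`; if the tokenizer repository requires a different loading path, follow the instructions on its model card instead.

```python
# Minimal usage sketch (assumption: the WikiText-103 tokenizer loads via
# AutoTokenizer.from_pretrained; consult its model card if it does not).
import torch
from transformers import AutoTokenizer, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("Kristijan/wikitext-103-tokenizer")
model = GPT2LMHeadModel.from_pretrained("Kristijan/gpt2_wt103-40m_12-layer")
model.eval()

# Score a short passage: average cross-entropy loss under the model.
text = "The meeting was held in the town hall on a Tuesday afternoon ."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, labels=inputs["input_ids"])
print(f"Average cross-entropy loss: {outputs.loss.item():.3f}")
```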