Kristijan committed on
Commit
4b2e4e2
1 Parent(s): ee733be

Update README.md

Files changed (1)
  1. README.md +9 -1
README.md CHANGED
@@ -23,4 +23,12 @@ model-index:
 
  ---
 
- #
+ # Model description
+
+ Paper: [Characterizing Verbatim Short-Term Memory in Neural Language Models](https://doi.org/10.48550/arXiv.2210.13569)
+
+ This is a GPT-2-small-like, decoder-only transformer language model trained on a 40M-token subset of the [WikiText-103 dataset](https://paperswithcode.com/dataset/wikitext-103).
+
+ # Intended uses
+
+ This checkpoint is intended for research purposes, for example for researchers interested in studying the behavior of transformer language models trained on smaller datasets.
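
For a rough sense of the scale implied by "gpt2-small-like", the sketch below counts the parameters of a GPT-2-style decoder and compares the total with the 40M training tokens mentioned in the description. It is an illustration only: the defaults are the published GPT-2 small hyperparameters (12 layers, hidden size 768, vocabulary 50257, 1024 positions), which are an assumption here, since the README does not state the checkpoint's exact configuration. `GPT2LikeConfig` and `count_parameters` are hypothetical helper names, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class GPT2LikeConfig:
    # ASSUMPTION: published GPT-2 small values; the checkpoint's real
    # configuration (especially vocab_size) may differ.
    n_layer: int = 12
    d_model: int = 768
    vocab_size: int = 50257
    n_positions: int = 1024


def count_parameters(cfg: GPT2LikeConfig) -> int:
    """Count weights and biases of a GPT-2-style decoder with tied output embeddings."""
    d = cfg.d_model
    # Token embeddings plus learned position embeddings.
    embeddings = cfg.vocab_size * d + cfg.n_positions * d
    # Each layer norm has a scale and a bias vector.
    layer_norm = 2 * d
    # Fused QKV projection plus the attention output projection.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Two-layer feed-forward network with a 4x hidden expansion.
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    block = 2 * layer_norm + attention + mlp
    # Final layer norm after the last block; the output layer reuses the token embeddings.
    return embeddings + cfg.n_layer * block + layer_norm


if __name__ == "__main__":
    total = count_parameters(GPT2LikeConfig())
    print(f"{total:,} parameters")
    print(f"{40_000_000 / total:.2f} training tokens per parameter")
```

Under these assumed hyperparameters the training set holds fewer tokens than the model has parameters, which is the "smaller datasets" regime the intended-use note refers to.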