
Model description

paper: Characterizing Verbatim Short-Term Memory in Neural Language Models

This is a gpt2-small-like decoder-only transformer model trained on the wikitext-103 dataset.


You can download and load the model as follows:

from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("Kristijan/gpt2_wt103_12-layer")

Alternatively, if you've downloaded the checkpoint files in this repository, you could also do:

from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained(path_to_folder_with_checkpoint_files)

BPE Tokenizer

You should first pretokenize your text using the MosesTokenizer:

from mosestokenizer import MosesTokenizer

# text_string is the raw input text (a Python str)
with MosesTokenizer('en') as pretokenize:
    pretokenized_text = " ".join(pretokenize(text_string))

Then, to BPE tokenize your text for this model, you should use the tokenizer trained on Wikitext-103:

from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("Kristijan/wikitext-103-tokenizer_v2")
tokenized_text = tokenizer.tokenize(pretokenized_text)
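
The two tokenization steps above can be chained and fed to the model. The following is only a minimal sketch, not the authors' evaluation code: the helper names `pretokenize_then_bpe` and `perplexity` are hypothetical, and it assumes `torch`, `transformers` and `mosestokenizer` are installed and that the input is short enough to fit in the model's context window.

```python
import math
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def pretokenize_then_bpe(
    text: str,
    pretokenize: Callable[[str], Iterable[str]],
    bpe: Callable[[str], T],
) -> T:
    """Pretokenize `text`, rejoin the pieces with single spaces, then apply `bpe`.

    Hypothetical helper: both steps are passed in as callables, so the same
    function works with tokenizer.tokenize (token strings) or an encoder
    that returns input ids.
    """
    pretokenized_text = " ".join(pretokenize(text))
    return bpe(pretokenized_text)


def perplexity(text: str) -> float:
    """Hypothetical helper: perplexity of `text` under the checkpoint.

    Heavy imports live inside the function; the first call downloads the
    tokenizer and model from the Hub.
    """
    import torch
    from mosestokenizer import MosesTokenizer
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("Kristijan/wikitext-103-tokenizer_v2")
    model = GPT2LMHeadModel.from_pretrained("Kristijan/gpt2_wt103_12-layer").eval()

    with MosesTokenizer("en") as pretokenize:
        input_ids = pretokenize_then_bpe(
            text,
            pretokenize,
            lambda s: tokenizer(s, return_tensors="pt").input_ids,
        )

    # Passing the inputs as labels makes the model return the mean
    # next-token cross-entropy; exponentiating it gives perplexity.
    with torch.no_grad():
        loss = model(input_ids, labels=input_ids).loss
    return math.exp(loss.item())
```

Calling `perplexity("The cat sat on the mat.")` would return a single float. No expected value is given here because it depends on the downloaded checkpoint.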

Intended uses

This checkpoint is intended for research purposes, for example for studying the behavior of transformer language models trained on smaller datasets.
