|
This is a `gpt2-medium` model, fine-tuned on the Wikitext-103 dataset.
|
|
|
It achieves a perplexity of **11.55** with a "sliding window" context, as measured by the `run_clm.py` script at [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers).
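
A "sliding window" (strided) perplexity evaluation can be reproduced with plain `transformers`, along the lines of the strided-perplexity example in the Hugging Face docs. This is a minimal sketch, not the repository's exact evaluation; the model ID, eval-file path, and stride below are illustrative placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholders for illustration; substitute this model's actual Hub ID
# and the Wikitext-103 evaluation text.
model_id = "gpt2-medium"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

text = open("wiki.valid.tokens").read()  # concatenated eval text (placeholder path)
encodings = tokenizer(text, return_tensors="pt")
seq_len = encodings.input_ids.size(1)

max_length = model.config.n_positions  # 1024 for GPT-2-family models
stride = 512  # step between windows; the repo's exact setting may differ

nlls, prev_end = [], 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end  # score only tokens not covered by a previous window
    input_ids = encodings.input_ids[:, begin:end]
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # -100 masks context-only tokens out of the loss

    with torch.no_grad():
        # mean NLL over the scored tokens (slight edge effect at window
        # boundaries, as in the Hugging Face docs example)
        loss = model(input_ids, labels=target_ids).loss
    nlls.append(loss * trg_len)

    prev_end = end
    if end == seq_len:
        break

ppl = torch.exp(torch.stack(nlls).sum() / prev_end)
print(f"sliding-window perplexity: {ppl.item():.2f}")
```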
|
|
|
For comparison, results for smaller base LMs fine-tuned on the same dataset:

| Base LM | `distilgpt2` | `gpt2` |
| :--- | ---: | ---: |
| base perplexity | 18.25 | 14.84 |
| + kNN-LM | 15.03 | 12.57 |
| + RetoMaton | **14.70** | **12.46** |
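
For context, the `+ kNN-LM` and `+ RetoMaton` rows interpolate the base LM's next-token distribution with a distribution induced by nearest-neighbor retrieval from a datastore (RetoMaton additionally organizes the datastore as an automaton to reuse retrievals). A minimal sketch of that interpolation, with an illustrative `lam` rather than the paper's tuned value:

```python
import torch

def interpolate(p_lm: torch.Tensor, p_knn: torch.Tensor, lam: float = 0.25) -> torch.Tensor:
    """kNN-LM-style interpolation: p(w|c) = lam * p_knn(w|c) + (1 - lam) * p_lm(w|c).

    `lam` here is illustrative; in practice it is tuned on validation data.
    """
    return lam * p_knn + (1.0 - lam) * p_lm
```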
|
|
|
This model was released as part of the paper ["Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval"](https://arxiv.org/pdf/2201.12431.pdf) (ICML'2022). |
|
|
|
For more information, see: [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers) |
|
|
|
If you use this model, please cite: |
|
```
@inproceedings{alon2022neuro,
  title={Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval},
  author={Alon, Uri and Xu, Frank and He, Junxian and Sengupta, Sudipta and Roth, Dan and Neubig, Graham},
  booktitle={International Conference on Machine Learning},
  pages={468--485},
  year={2022},
  organization={PMLR}
}
```