|
This is a `gpt2-medium` model, fine-tuned on the Wikitext-103 dataset.
|
|
|
It achieves a perplexity of **11.55** with a "sliding window" context, as measured by the `run_clm.py` script at [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers).
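
A "sliding window" (strided) perplexity evaluation can be reproduced with plain `transformers`, along the lines of the strided-perplexity example in the Hugging Face docs. This is a minimal sketch, not the repository's exact evaluation; the model ID, eval-file path, and stride below are illustrative placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholders for illustration; substitute this model's actual Hub ID
# and the Wikitext-103 evaluation text.
model_id = "gpt2-medium"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

text = open("wiki.valid.tokens").read()  # concatenated eval text (placeholder path)
encodings = tokenizer(text, return_tensors="pt")
seq_len = encodings.input_ids.size(1)

max_length = model.config.n_positions  # 1024 for GPT-2-family models
stride = 512  # step between windows; the repo's exact setting may differ

nlls, prev_end = [], 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end  # score only tokens not covered by a previous window
    input_ids = encodings.input_ids[:, begin:end]
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # -100 masks context-only tokens out of the loss

    with torch.no_grad():
        # mean NLL over the scored tokens (slight edge effect at window
        # boundaries, as in the Hugging Face docs example)
        loss = model(input_ids, labels=target_ids).loss
    nlls.append(loss * trg_len)

    prev_end = end
    if end == seq_len:
        break

ppl = torch.exp(torch.stack(nlls).sum() / prev_end)
print(f"sliding-window perplexity: {ppl.item():.2f}")
```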
|
|
|
For comparison, results for smaller base LMs fine-tuned on the same dataset:

| Base LM | `distilgpt2` | `gpt2` |
| :--- | ---: | ---: |
| base perplexity | 18.25 | 14.84 |
| + kNN-LM | 15.03 | 12.57 |
| + RetoMaton | **14.70** | **12.46** |
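
For context, the `+ kNN-LM` and `+ RetoMaton` rows interpolate the base LM's next-token distribution with a distribution induced by nearest-neighbor retrieval from a datastore (RetoMaton additionally organizes the datastore as an automaton to reuse retrievals). A minimal sketch of that interpolation, with an illustrative `lam` rather than the paper's tuned value:

```python
import torch

def interpolate(p_lm: torch.Tensor, p_knn: torch.Tensor, lam: float = 0.25) -> torch.Tensor:
    """kNN-LM-style interpolation: p(w|c) = lam * p_knn(w|c) + (1 - lam) * p_lm(w|c).

    `lam` here is illustrative; in practice it is tuned on validation data.
    """
    return lam * p_knn + (1.0 - lam) * p_lm
```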
|
|
|
This model was released as part of the paper ["Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval"](https://arxiv.org/pdf/2201.12431.pdf) (ICML'2022). |
|
|
|
For more information, see: [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers) |
|
|
|
If you use this model, please cite: |
|
```
@inproceedings{alon2022neuro,
  title={Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval},
  author={Alon, Uri and Xu, Frank and He, Junxian and Sengupta, Sudipta and Roth, Dan and Neubig, Graham},
  booktitle={International Conference on Machine Learning},
  pages={468--485},
  year={2022},
  organization={PMLR}
}
```