This is a `gpt2-large` model, fine-tuned on the Wikitext-103 dataset.

It achieves a perplexity of **10.56**, evaluated with a "sliding window" context using the `run_clm.py` script at [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers).
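For illustration, below is a minimal sketch of sliding-window perplexity evaluation using the standard Hugging Face Transformers pattern. This is not the repository's `run_clm.py` (the **10.56** figure comes from that script), and the model ID is a placeholder assumption for this checkpoint's Hub ID.

```python
# Minimal sketch of sliding-window perplexity evaluation (standard
# Transformers pattern). NOT the repository's run_clm.py; the reported
# 10.56 comes from that script. The model ID is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-checkpoint's-hub-id>"  # placeholder: substitute the real Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

text = "..."  # e.g., the concatenated Wikitext-103 validation text
encodings = tokenizer(text, return_tensors="pt")

max_length = model.config.n_positions  # 1024 for GPT-2
stride = 512                           # window step; smaller strides give each token more context
seq_len = encodings.input_ids.size(1)

nlls, prev_end = [], 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end  # score only tokens not covered by the previous window
    input_ids = encodings.input_ids[:, begin:end]
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # mask pure-context tokens out of the loss

    with torch.no_grad():
        nlls.append(model(input_ids, labels=target_ids).loss)

    prev_end = end
    if end == seq_len:
        break

print(f"perplexity: {torch.exp(torch.stack(nlls).mean()).item():.2f}")
```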

| Base LM         | `distilgpt2` | `gpt2`    |
| :---            | ---:         | ---:      |
| base perplexity | 18.25        | 14.84     |
| + kNN-LM        | 15.03        | 12.57     |
| + RetoMaton     | **14.70**    | **12.46** |

This model was released as part of the paper ["Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval"](https://arxiv.org/pdf/2201.12431.pdf) (ICML 2022).

For more information, see: [https://github.com/neulab/knn-transformers](https://github.com/neulab/knn-transformers)

If you use this model, please cite:
```bibtex
@inproceedings{alon2022neuro,
  title={Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval},
  author={Alon, Uri and Xu, Frank and He, Junxian and Sengupta, Sudipta and Roth, Dan and Neubig, Graham},
  booktitle={International Conference on Machine Learning},
  pages={468--485},
  year={2022},
  organization={PMLR}
}
```