How to use this model directly from the 🤗/transformers library:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("lysandre/arxiv-nlp")
model = AutoModel.from_pretrained("lysandre/arxiv-nlp")

ArXiv-NLP GPT-2 checkpoint

This is a GPT-2 small checkpoint for PyTorch. It is the official gpt2-small checkpoint fine-tuned on ArXiv papers from the computational linguistics field.
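Since this is a GPT-2 checkpoint, a natural use is text generation. `AutoModel` loads the bare transformer without a language-modeling head, so the sketch below assumes `AutoModelForCausalLM` from a reasonably recent transformers release instead. The helper names (`sampling_kwargs`, `generate`), the prompt, and the sampling values are illustrative assumptions, not settings published with this checkpoint.

```python
from typing import Any, Dict

MODEL_ID = "lysandre/arxiv-nlp"  # checkpoint name used on this page


def sampling_kwargs(max_new_tokens: int = 40, top_k: int = 50) -> Dict[str, Any]:
    """Illustrative sampling settings (assumed values, not the author's)."""
    return {"do_sample": True, "top_k": top_k, "max_new_tokens": max_new_tokens}


def generate(prompt: str, **overrides: Any) -> str:
    """Continue `prompt` with the fine-tuned checkpoint (hypothetical helper)."""
    # Imported lazily so that sampling_kwargs can be used without loading the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    kwargs = {**sampling_kwargs(), **overrides}
    # GPT-2 defines no padding token, so the end-of-text token is reused for padding.
    output_ids = model.generate(
        **inputs, pad_token_id=tokenizer.eos_token_id, **kwargs
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("In this paper, we propose")` downloads the checkpoint on first use and returns the prompt followed by a sampled continuation. Because sampling is enabled, the output differs from run to run.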

Training data

This model was trained on a subset of ArXiv papers that were converted from PDF to plain text. The resulting dataset consists of 80 MB of text from the computational linguistics (cs.CL) field.
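The exact preprocessing pipeline is not described here. As a rough sketch only, the following shows one way a directory of per-paper text files could be merged into a single training file and its size checked. The function name `build_corpus`, the directory layout, and the rule of skipping empty files are all assumptions, not the procedure actually used for this checkpoint.

```python
from pathlib import Path


def build_corpus(txt_dir: str, out_path: str) -> float:
    """Concatenate every .txt file under txt_dir into out_path.

    Returns the size of the merged file in megabytes (10^6 bytes).
    """
    out_file = Path(out_path).resolve()
    with open(out_file, "w", encoding="utf-8", newline="\n") as out:
        for path in sorted(Path(txt_dir).rglob("*.txt")):
            if path.resolve() == out_file:
                continue  # never read the file we are writing
            text = path.read_text(encoding="utf-8", errors="ignore").strip()
            if not text:
                continue  # skip papers whose PDF extraction produced nothing
            out.write(text + "\n\n")  # blank line between papers
    return out_file.stat().st_size / 1_000_000
```

For example, `build_corpus("papers_txt", "train.txt")` (both paths hypothetical) would write the merged corpus and return its size, which could then be compared against the 80 MB figure above.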