Full code and details at https://github.com/csinva/gpt-paper-title-generator

Model

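from transformers import AutoModelForCausalLM, pipeline, AutoTokenizer

# Finetuned weights; the tokenizer is the unchanged base GPT-Neo tokenizer
model = AutoModelForCausalLM.from_pretrained("csinva/gpt-neo-2.7B-titles")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Prompt with a year followed by two newlines; returns a list of dicts
# with a 'generated_text' key
pipe('2022\n\n')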
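
The pipeline output echoes the prompt and may run past the first title. The following is a minimal sketch of pulling out just the title, assuming the model continues the prompt with a title terminated by a newline (the finetuning format described under Data). `extract_title` is a hypothetical helper for illustration, not part of the repository, and the sample string is hand-written rather than real model output.

```python
def extract_title(generated_text: str, prompt: str = "2022\n\n") -> str:
    """Return the first non-empty line that follows the prompt."""
    # Text-generation pipelines include the prompt in 'generated_text'
    # by default, so drop it when it is present.
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text
    # The title is assumed to be the first non-blank line of the continuation.
    for line in continuation.split("\n"):
        if line.strip():
            return line.strip()
    return ""


# Hand-written stand-in for pipe('2022\n\n'), using the title from the Data section
sample_output = [{
    "generated_text": "2022\n\n Emb-GAM: an Interpretable and Efficient "
                      "Predictor using Pre-trained Language Models\n2022\n\n"
}]
print(extract_title(sample_output[0]["generated_text"]))
```

With real output, the same call would be `extract_title(pipe('2022\n\n')[0]['generated_text'])`.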

Data

  • all papers on arXiv in the categories cs.AI, cs.LG, stat.ML
    • date cutoff: only finetuned on papers dated on or before Apr 1, 2022
    • random 5% of papers also excluded
    • this results in 98,388 papers for finetuning
  • during finetuning, each paper title was presented in the format <year>\n\n <title>\n (e.g. 2022\n\n Emb-GAM: an Interpretable and Efficient Predictor using Pre-trained Language Models\n)
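
The selection and formatting steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's actual preprocessing: the paper records (dicts with `date` and `title` keys), the function names, and the seed are all hypothetical, and the 5% exclusion is modeled here as an independent per-paper coin flip, which may differ from how the holdout was really drawn. The year in the prompt is assumed to be the year of the paper's date.

```python
import random
from datetime import date

CUTOFF = date(2022, 4, 1)   # date cutoff stated in the model card
HOLDOUT_FRACTION = 0.05     # random share of papers excluded


def select_for_finetuning(papers, seed=0):
    """Keep papers dated on or before the cutoff, minus a random ~5%."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    kept = []
    for paper in papers:
        if paper["date"] > CUTOFF:
            continue  # published after the cutoff
        if rng.random() < HOLDOUT_FRACTION:
            continue  # randomly excluded
        kept.append(paper)
    return kept


def format_example(year, title):
    """Build one finetuning string in the <year>\\n\\n <title>\\n format."""
    return f"{year}\n\n {title}\n"


# Toy records for illustration only
papers = [
    {"date": date(2022, 3, 1), "title": "An Example Title"},
    {"date": date(2022, 6, 1), "title": "Published After the Cutoff"},
]
examples = [format_example(p["date"].year, p["title"])
            for p in select_for_finetuning(papers)]
```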