Note: see DeepKPG for instructions on using this model with Hugging Face, including how to set up the newly trained tokenizer.

Paper: Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study

@article{https://doi.org/10.48550/arxiv.2212.10233,
  doi = {10.48550/ARXIV.2212.10233},
  url = {https://arxiv.org/abs/2212.10233},
  author = {Wu, Di and Ahmad, Wasi Uddin and Chang, Kai-Wei},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}

Pre-training Corpus: S2ORC (titles and abstracts)

Pre-training Details:

  • Pre-trained from scratch with a science vocabulary
  • Batch size: 2048
  • Total steps: 250k
  • Learning rate: 3e-4
  • LR schedule: polynomial with 10k warmup steps
  • Masking ratio: 30%, Poisson lambda = 3.5
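
The learning-rate settings listed above can be illustrated with a minimal sketch of a polynomial schedule with warmup. The peak rate (3e-4), warmup length (10k steps), and total steps (250k) come from the list; the linear warmup from zero, the decay power of 1.0, and the final rate of 0.0 are assumptions, since the card does not state them. The function name `polynomial_lr` is hypothetical and is not taken from the training code.

```python
def polynomial_lr(step, peak_lr=3e-4, warmup_steps=10_000,
                  total_steps=250_000, end_lr=0.0, power=1.0):
    """Return the learning rate at a given optimizer step.

    peak_lr, warmup_steps, and total_steps follow the model card.
    end_lr and power are assumed values (not stated in the card).
    """
    if step < warmup_steps:
        # Assumed: linear warmup from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    if step >= total_steps:
        # After the last step the rate stays at its final value.
        return end_lr
    # Fraction of the decay phase that is still ahead, in (0, 1].
    remaining = 1.0 - (step - warmup_steps) / (total_steps - warmup_steps)
    # Polynomial decay from peak_lr toward end_lr.
    return (peak_lr - end_lr) * remaining ** power + end_lr


# Print the rate at a few points in training.
for s in (0, 5_000, 10_000, 130_000, 250_000):
    print(s, polynomial_lr(s))
```

With `power=1.0` the decay phase is a straight line; a different power or a nonzero `end_lr` would change the shape of the decay but not the warmup.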