TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension Paper • 1705.03551 • Published May 9, 2017
StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow Paper • 1803.09371 • Published Mar 26, 2018
SpanBERT: Improving Pre-training by Representing and Predicting Spans Paper • 1907.10529 • Published Jul 24, 2019
Pretrained Language Models for Sequential Sentence Classification Paper • 1909.04054 • Published Sep 9, 2019
The Newspaper Navigator Dataset: Extracting And Analyzing Visual Content from 16 Million Historic Newspaper Pages in Chronicling America Paper • 2005.01583 • Published May 4, 2020 • 2
SPECTER: Document-level Representation Learning using Citation-informed Transformers Paper • 2004.07180 • Published Apr 15, 2020
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models Paper • 2101.00288 • Published Jan 1, 2021
A Search Engine for Discovery of Scientific Challenges and Directions Paper • 2108.13751 • Published Aug 31, 2021
Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search Paper • 2203.08436 • Published Mar 16, 2022
ArxivDIGESTables: Synthesizing Scientific Literature into Tables using Language Models Paper • 2410.22360 • Published Oct 25, 2024