HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering Paper • 1809.09600 • Published Sep 25, 2018
Answering Complex Open-domain Questions Through Iterative Query Generation Paper • 1910.07000 • Published Oct 15, 2019
Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent Paper • 2207.12021 • Published Jul 25, 2022
SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences Paper • 2208.02169 • Published Aug 3, 2022
RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering Paper • 2407.13998 • Published Jul 19, 2024
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents Paper • 2411.06559 • Published Nov 10, 2024
Presenting a Paper is an Art: Self-Improvement Aesthetic Agents for Academic Presentations Paper • 2510.05571 • Published Oct 7, 2025
Towards Policy-Compliant Agents: Learning Efficient Guardrails For Policy Violation Detection Paper • 2510.03485 • Published Oct 3, 2025
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages Paper • 2003.07082 • Published Mar 16, 2020
WARC-Bench: Web Archive Based Benchmark for GUI Subtask Executions Paper • 2510.09872 • Published Oct 10, 2025