Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published Feb 10 • 131
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27, 2024 • 142
OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web Paper • 2402.17553 • Published Feb 27, 2024 • 25
Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning Paper • 2307.03692 • Published Jul 5, 2023 • 26