ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation Paper • 2410.01731 • Published Oct 2 • 16
CoverBench: A Challenging Benchmark for Complex Claim Verification Paper • 2408.03325 • Published Aug 6 • 14
Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models Paper • 2407.19474 • Published Jul 28 • 23
Evaluating the Ripple Effects of Knowledge Editing in Language Models Paper • 2307.12976 • Published Jul 24, 2023 • 11
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers Paper • 2103.15679 • Published Mar 29, 2021
Transformer Interpretability Beyond Attention Visualization Paper • 2012.09838 • Published Dec 17, 2020
Answering Questions by Meta-Reasoning over Multiple Chains of Thought Paper • 2304.13007 • Published Apr 25, 2023 • 1
Making Retrieval-Augmented Language Models Robust to Irrelevant Context Paper • 2310.01558 • Published Oct 2, 2023 • 2
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks? Paper • 2407.15711 • Published Jul 22 • 9
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty Paper • 2407.06071 • Published Jul 8 • 7
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17 • 50
Hopping Too Late: Exploring the Limitations of Large Language Models on Multi-Hop Queries Paper • 2406.12775 • Published Jun 18
Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces Paper • 2406.11614 • Published Jun 17 • 4
In-Context Learning with Long-Context Models: An In-Depth Exploration Paper • 2405.00200 • Published Apr 30
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Paper • 2404.08801 • Published Apr 12 • 63