MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases • Paper • 2406.10290 • Published Jun 12, 2024
CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities • Paper • 2401.06961 • Published Jan 13, 2024
Towards Understanding the Behaviors of Optimal Deep Active Learning Algorithms • Paper • 2101.00977 • Published Dec 29, 2020
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations • Paper • 2310.11207 • Published Oct 17, 2023