ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models Paper • 2406.20015 • Published Jun 28 • 1
Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence Paper • 2209.02970 • Published Sep 7, 2022
Solving Math Word Problems via Cooperative Reasoning induced Language Models Paper • 2210.16257 • Published Oct 28, 2022
EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval Paper • 2310.00970 • Published Oct 2, 2023
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29 • 46
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation Paper • 2406.09961 • Published Jun 14 • 54
PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents Paper • 2406.13923 • Published Jun 20 • 21