URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics • Paper • 2501.04686 • Published 1 day ago • 35
Flames: Benchmarking Value Alignment of LLMs in Chinese • Paper • 2311.06899 • Published Nov 12, 2023 • 2