BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation • Paper • 2504.14538 • Published 29 days ago • 28 • 2
JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models' Detection of Human Self-Destructive Behavior Content in Jirai Community • Paper • 2503.21679 • Published Mar 27 • 1
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era • Paper • 2503.12329 • Published Mar 16 • 25
Implicit Reasoning in Transformers is Reasoning through Shortcuts • Paper • 2503.07604 • Published Mar 10 • 22