Agent-SafetyBench: Evaluating the Safety of LLM Agents • Paper • 2412.14470 • Published 11 days ago • 8
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models • Paper • 2412.11605 • Published 14 days ago • 15