Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System Paper • 2502.16750 • Published 23 days ago • 10
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 21 days ago • 70
Beyond Release: Access Considerations for Generative AI Systems Paper • 2502.16701 • Published 23 days ago • 12
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published 27 days ago • 66
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement Paper • 2410.04444 • Published Oct 6, 2024 • 2
Tree of Thoughts: Deliberate Problem Solving with Large Language Models Paper • 2305.10601 • Published May 17, 2023 • 12
Chain of Hindsight Aligns Language Models with Feedback Paper • 2302.02676 • Published Feb 6, 2023 • 1
ReAct: Synergizing Reasoning and Acting in Language Models Paper • 2210.03629 • Published Oct 6, 2022 • 24
Reflexion: Language Agents with Verbal Reinforcement Learning Paper • 2303.11366 • Published Mar 20, 2023 • 5
Demonstrating specification gaming in reasoning models Paper • 2502.13295 • Published 28 days ago • 1