Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents Paper • 2406.18062 • Published Jun 26, 2024 • 1
Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities Paper • 2410.18469 • Published Oct 24, 2024 • 1
Effective Skill Unlearning through Intervention and Abstention Paper • 2503.21730 • Published 20 days ago • 1
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models Paper • 2503.22048 • Published 20 days ago • 2