- SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities (arXiv:2502.12025, published Feb 2025)
- Negative Token Merging: Image-based Adversarial Feature Guidance (arXiv:2412.01339, published Dec 2, 2024)
- CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models (arXiv:2406.12257, published Jun 18, 2024)
- Stronger Models are NOT Stronger Teachers for Instruction Tuning (arXiv:2411.07133, published Nov 11, 2024)
- Improve Vision Language Model Chain-of-thought Reasoning (arXiv:2410.16198, published Oct 21, 2024)
- Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models (arXiv:2410.02740, published Oct 3, 2024)
- MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning (arXiv:2409.20566, published Sep 30, 2024)
- StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements (arXiv:2408.15666, published Aug 28, 2024)
- ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates (arXiv:2406.12935, published Jun 17, 2024)
- Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models (arXiv:2406.09403, published Jun 13, 2024)
- SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding (arXiv:2402.08983, published Feb 14, 2024)
- ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs (arXiv:2402.11753, published Feb 19, 2024)
- BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models (arXiv:2401.12242, published Jan 20, 2024)