SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities • arXiv:2502.12025 • Published Feb 2025
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models • arXiv:2406.12257 • Published Jun 18, 2024
Stronger Models are NOT Stronger Teachers for Instruction Tuning • arXiv:2411.07133 • Published Nov 11, 2024
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates • arXiv:2406.12935 • Published Jun 17, 2024
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding • arXiv:2402.08983 • Published Feb 14, 2024
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs • arXiv:2402.11753 • Published Feb 19, 2024
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models • arXiv:2401.12242 • Published Jan 20, 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing • arXiv:2406.08464 • Published Jun 12, 2024
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research • arXiv:2402.00159 • Published Jan 31, 2024
Paloma: A Benchmark for Evaluating Language Model Fit • arXiv:2312.10523 • Published Dec 16, 2023
PEEKABOO: Interactive Video Generation via Masked-Diffusion • arXiv:2312.07509 • Published Dec 12, 2023
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents • arXiv:2311.05437 • Published Nov 9, 2023
DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations • arXiv:2307.07047 • Published Jul 13, 2023