Scalable Extraction of Training Data from (Production) Language Models
Paper • 2311.17035
https://llm-attacks.org/
Note ✅ Backdoor Traps ✅ Honeypot Schemes
Note Large language models (LLMs) currently operate without a hierarchy of instruction privilege, leaving them vulnerable to attacks similar to those experienced in early operating systems. This paper proposes establishing an instruction hierarchy within LLMs, prioritizing higher-privileged instructions to mitigate the vulnerabilities and enhance security.