Commit 9bdf0cf by migueldeguzmandev
1 Parent(s): ebab6ad
Update README.md
README.md
CHANGED
@@ -1,3 +1,5 @@
+Research wireframe: [Click here!](https://whimsical.com/the-rllm-wireframe-QQvFHNr6aVDdXRUnyb5NCu)
+
 RLLMv7 / This experiment: [Can RLLMv3's ability to defend against jailbreaks be attributed to datasets containing stories about Jung's shadow integration theory?](https://www.lesswrong.com/posts/Rc6hb48nq38QrQ7qb/can-rllmv3-s-ability-to-defend-against-jailbreaks-be)
 
 GPT2XL_RLLMv3 Post: [BetterDAN, AI Machiavelli & Oppo Jailbreaks vs. SOTA models & GPT2XL_RLLMv3](https://www.lesswrong.com/posts/vZ5fM6FtriyyKbwi9/betterdan-ai-machiavelli-and-oppo-jailbreaks-vs-sota-models?utm_campaign=post_share&utm_source=link)