Update index.html
index.html +4 -1
index.html
CHANGED
@@ -132,7 +132,10 @@ Exploring Refusal Loss Landscapes </title>
 <ul>
 <li>Paper: <a href="https://arxiv.org/abs/2310.08419" target="_blank" rel="noopener noreferrer">
 Jailbreaking Black Box Large Language Models in Twenty Queries</a></li>
-<li>Brief Introduction:
+<li>Brief Introduction: PAIR uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM
+without human intervention. The attacker LLM iteratively queries the target LLM to update and refine a candidate
+jailbreak based on the comments and the rated score provided by another Judge model.
+Empirically, PAIR often requires fewer than twenty queries to produce a successful jailbreak.</li>
 </ul>
 </div>
 <h3>TAP</h3>
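Note on the added text: the attacker / target / judge loop it describes can be sketched roughly as below. This is only an illustrative skeleton, not the paper's implementation. The names `refine_loop`, `Attempt`, `propose`, `query_target`, and `judge` are hypothetical, and the 1 to 10 score scale, the success threshold of 10, and the default budget of 20 queries are assumptions made for this sketch. The three models are stood in for by toy functions so the control flow can be run without any LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One round of the loop: candidate, target reply, judge verdict."""
    prompt: str
    response: str
    score: int
    feedback: str


def refine_loop(
    propose: Callable[[List[Attempt]], str],        # stand-in for the attacker model
    query_target: Callable[[str], str],             # stand-in for the target model
    judge: Callable[[str, str], Tuple[int, str]],   # stand-in for the judge model
    max_queries: int = 20,      # assumed budget, not taken from the commit
    success_score: int = 10,    # assumed threshold on an assumed 1-10 scale
) -> Tuple[Optional[Attempt], List[Attempt]]:
    """Iteratively refine a candidate using judge feedback.

    Returns (first successful attempt or None, full history).
    """
    history: List[Attempt] = []
    for _ in range(max_queries):
        # The proposer sees all earlier attempts, including scores and comments.
        prompt = propose(history)
        response = query_target(prompt)
        score, feedback = judge(prompt, response)
        attempt = Attempt(prompt, response, score, feedback)
        history.append(attempt)
        if score >= success_score:
            return attempt, history
    return None, history


# Toy stand-ins (purely illustrative) that exercise the control flow.
def toy_propose(history: List[Attempt]) -> str:
    return f"candidate-{len(history)}"


def toy_target(prompt: str) -> str:
    return f"reply to {prompt}"


def toy_judge(prompt: str, response: str) -> Tuple[int, str]:
    n = int(prompt.split("-")[1])
    score = min(10, 1 + 4 * n)  # later candidates score higher in this toy setup
    return score, f"scored {score}"


winner, history = refine_loop(toy_propose, toy_target, toy_judge)
```

With these stand-ins the loop stops as soon as the judge's score reaches the threshold. With real models, each of the three callables would wrap a separate model call and the proposer would condition on the judge's comments in `history`.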