gregH committed on
Commit e5cb574 · verified · 1 Parent(s): fc5cf53

Update index.html

Files changed (1)
  1. index.html +1 -1
index.html CHANGED
@@ -158,7 +158,7 @@ We provide more details about the running flow of Gradient Cuff in the paper.
  <h2 id="demonstration">Demonstration</h2>
  <p>We evaluated Gradient Cuff as well as 4 baselines (Perplexity Filter, SmoothLLM, Erase-and-Check, and Self-Reminder) against 6
  different jailbreak attacks~(GCG, AutoDAN, PAIR, TAP, Base64, and LRL) and benign user queries on 2 LLMs (LLaMA-2-7B-Chat and Vicuna-7B-V1.5).
- We demonstrate the average refusal rate across these 6 malicious user query datasets and the refusal rate
+ We demonstrate the average refusal rate across these 6 malicious user query datasets as the Average Malicious Refusal Rate and the refusal rate
  on benign user queries as the Benign Refusal Rate.
  </p>
 
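For reviewers, the two metrics named in the changed line can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the Average Malicious Refusal Rate is the unweighted mean of the six per-attack refusal rates (the page does not say whether the average is per-dataset or pooled), and the function names and toy outcomes are hypothetical.

```python
from statistics import mean

# The six jailbreak attacks listed in index.html.
ATTACKS = ("GCG", "AutoDAN", "PAIR", "TAP", "Base64", "LRL")


def refusal_rate(refused):
    """Fraction of queries that were refused, given a list of booleans."""
    if not refused:
        raise ValueError("need at least one query")
    return sum(refused) / len(refused)


def average_malicious_refusal_rate(per_attack):
    """Unweighted mean of the per-attack refusal rates (an assumption)."""
    return mean(refusal_rate(per_attack[name]) for name in ATTACKS)


# Hypothetical toy outcomes: True means the defended model refused the query.
toy_malicious = {name: [True, True, False, True] for name in ATTACKS}
toy_malicious["Base64"] = [True, False, False, False]
toy_benign = [False, False, True, False]

malicious_rate = average_malicious_refusal_rate(toy_malicious)
benign_rate = refusal_rate(toy_benign)
print(f"Average Malicious Refusal Rate: {malicious_rate:.3f}")
print(f"Benign Refusal Rate: {benign_rate:.3f}")
```

A good defense pushes the first number up while keeping the second low, which is why the page reports the two rates side by side.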