gregH committed on
Commit
2d4556c
1 Parent(s): b92dddc

Update index.html

Files changed (1)
  1. index.html +4 -7
index.html CHANGED
@@ -143,13 +143,10 @@ Exploring Refusal Loss Landscapes </title>
  <p>
  Gradient Cuff can be summarized into two phases:
  <span>
- $$
- \begin{itemize}
- \item \textbf{(Phase 1) Sampling-based Rejection:}~In the first step, we reject the user query $x$ by checking whether $f_\theta(x)<0.5$. If true, then $x$ is rejected, otherwise, $x$ is pushed into phase 2.
- \item \textbf{(Phase 2) Gradient Norm Rejection:}~In the second step, we regard $x$ as having jailbreak attempts if the norm of the estimated gradient $g_\theta(x)$ is larger than a configurable threshold $t$, i.e., $\|g_\theta(x)\| > t$.
- \end{itemize}
- $$
- </span>
+ <strong>(Phase 1) Sampling-based Rejection:</strong> In the first step, we reject the user query by checking whether $f_\theta(x)<0.5$. If true, then $x$ is rejected, otherwise, $x$ is pushed into phase 2.
+ </p>
+ <p>
+ <strong>(Phase 2) Gradient Norm Rejection:</strong> In the second step, we regard $x$ as having jailbreak attempts if the norm of the estimated gradient $g_\theta(x)$ is larger than a configurable threshold $t$, i.e., $\|g_\theta(x)\| > t$.
  </p>
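The two-phase decision rule described in the changed paragraph can be sketched as follows. This is only an illustration of the rule as worded in the diff, not the project's implementation: the function name `gradient_cuff_reject` and its parameters are hypothetical, the refusal loss $f_\theta(x)$ and the estimated gradient $g_\theta(x)$ are assumed to be computed elsewhere and passed in, and the Euclidean norm is an assumption, since the text only says "the norm".

```python
import math
from typing import Sequence


def gradient_cuff_reject(
    refusal_loss: float,
    grad_estimate: Sequence[float],
    threshold: float,
) -> bool:
    """Return True if the query x should be rejected.

    refusal_loss  -- the value f_theta(x), computed elsewhere (hypothetical input)
    grad_estimate -- the estimated gradient g_theta(x) as a flat vector (hypothetical input)
    threshold     -- the configurable threshold t
    """
    # Phase 1: sampling-based rejection. Reject when f_theta(x) < 0.5.
    if refusal_loss < 0.5:
        return True

    # Phase 2: gradient norm rejection. Flag x when ||g_theta(x)|| > t.
    # Assumption: Euclidean (L2) norm.
    grad_norm = math.sqrt(sum(g * g for g in grad_estimate))
    return grad_norm > threshold
```

With this sketch, a query with `refusal_loss=0.3` is rejected in phase 1 regardless of its gradient, while a query with `refusal_loss=0.8` is rejected only if its gradient norm strictly exceeds `threshold`.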