Update index.html
index.html CHANGED +10 -9
@@ -4,13 +4,14 @@
4 |   <meta charset="UTF-8">
5 |
6 |   <!-- Begin Jekyll SEO tag v2.8.0 -->
7 | - <title>
8 | -
9 |   <meta property="og:locale" content="en_US" />
10 | - <meta name="description" content="
11 | - <meta property="og:description" content="
12 |   <script type="application/ld+json">
13 | - {"@context":"https://schema.org","@type":"WebSite","description":"
14 |   <!-- End Jekyll SEO tag -->
15 |
16 |   <link rel="preconnect" href="https://fonts.gstatic.com">
@@ -45,8 +46,8 @@
45 |   <a id="skip-to-content" href="#content">Skip to the content.</a>
46 |
47 |   <header class="page-header" role="banner">
48 | - <h1 class="project-name">
49 | - <h2 class="project-tagline">
50 |
51 |
52 |   </header>
@@ -62,7 +63,7 @@ our proposed framework <strong>Neural Clamping</strong>, which employs a simple
62 |   transformation on a pre-trained classifier. We also provide other calibration approaches
63 |   (e.g., temperature scaling) to compare with Neural Clamping.</p>
64 |
65 | - <h2 id="what-is-
66 |   <p>Neural Network Calibration seeks to make model prediction align with its true correctness likelihood.
67 |   A well-calibrated model should provide accurate predictions and reliable confidence when making inferences. On the
68 |   contrary, a poor calibration model would have a wide gap between its accuracy and average confidence level.
@@ -196,7 +197,7 @@ Using this tool, users can use our proposed package, \(\texttt{NCTookit}\), to c
196 |   <p>If you find Neural Clamping helpful and useful for your research, please cite our main paper as follows:</p>
197 |
198 |   <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@inproceedings{hsiung2023nctv,
199 | - title={{NCTV:
200 |   author={Lei Hsiung, Yung-Chen Tang and Pin-Yu Chen and Tsung-Yi Ho},
201 |   booktitle={Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence},
202 |   publisher={Association for the Advancement of Artificial Intelligence},
4 |   <meta charset="UTF-8">
5 |
6 |   <!-- Begin Jekyll SEO tag v2.8.0 -->
7 | + <title>Gradient Cuff | Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by
8 | + Exploring Refusal Loss Landscapes </title>
9 | + <meta property="og:title" content="Gradient Cuff" />
10 |   <meta property="og:locale" content="en_US" />
11 | + <meta name="description" content="Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes" />
12 | + <meta property="og:description" content="Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes" />
13 |   <script type="application/ld+json">
14 | + {"@context":"https://schema.org","@type":"WebSite","description":"Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes","headline":"Gradient Cuff","name":"Gradient Cuff","url":"https://huggingface.co/spaces/gregH/Gradient Cuff"}</script>
15 |   <!-- End Jekyll SEO tag -->
16 |
17 |   <link rel="preconnect" href="https://fonts.gstatic.com">
46 |   <a id="skip-to-content" href="#content">Skip to the content.</a>
47 |
48 |   <header class="page-header" role="banner">
49 | + <h1 class="project-name">Gradient Cuff</h1>
50 | + <h2 class="project-tagline">Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes</h2>
51 |
52 |
53 |   </header>
63 |   transformation on a pre-trained classifier. We also provide other calibration approaches
64 |   (e.g., temperature scaling) to compare with Neural Clamping.</p>
65 |
66 | + <h2 id="what-is-jailbreak">What is Calibration?</h2>
67 |   <p>Neural Network Calibration seeks to make model prediction align with its true correctness likelihood.
68 |   A well-calibrated model should provide accurate predictions and reliable confidence when making inferences. On the
69 |   contrary, a poor calibration model would have a wide gap between its accuracy and average confidence level.
197 |   <p>If you find Neural Clamping helpful and useful for your research, please cite our main paper as follows:</p>
198 |
199 |   <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@inproceedings{hsiung2023nctv,
200 | + title={{NCTV: Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes}},
201 |   author={Lei Hsiung, Yung-Chen Tang and Pin-Yu Chen and Tsung-Yi Ho},
202 |   booktitle={Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence},
203 |   publisher={Association for the Advancement of Artificial Intelligence},