Commit 0a88363
Parent(s): 8ec4512
bold settings name descriptions

demo_watermark.py CHANGED (+10 -10)

The updated help text (lines 579-625 of `run_gradio`):

#### Generation Parameters:

- **Decoding Method** : We can generate tokens from the model using either multinomial sampling or greedy decoding.
- **Sampling Temperature** : If using multinomial sampling, we can set the temperature of the sampling distribution. 0.0 is equivalent to greedy decoding, and 1.0 gives the maximum amount of variability/entropy in the next-token distribution. 0.7 strikes a nice balance between faithfulness to the model's estimate of the top candidates and added variety. Does not apply to greedy decoding.
- **Generation Seed** : The integer to pass to the torch random number generator before running generation. Makes the multinomial sampling strategy's outputs reproducible. Does not apply to greedy decoding.
- **Number of Beams** : When using greedy decoding, we can also set the number of beams to > 1 to enable beam search. This is not implemented for multinomial sampling (and is excluded from the paper), but may be added in the future.
- **Max Generated Tokens** : The `max_new_tokens` parameter passed to the generation method to stop the output at a certain number of new tokens. Note that the model is free to generate fewer tokens depending on the prompt. Implicitly, this caps the number of prompt tokens at the model's maximum input length minus `max_new_tokens`, and inputs will be truncated accordingly. A sketch of how these settings map onto a generation call follows this list.
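
To make the settings concrete, here is a minimal sketch of how they typically map onto a Hugging Face `generate()` call; the `gpt2` checkpoint, prompt, and parameter values are placeholders, not the demo's actual wiring:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

torch.manual_seed(123)  # Generation Seed: makes multinomial sampling reproducible

inputs = tokenizer("The watermark works by", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    do_sample=True,      # Decoding Method: multinomial sampling (False = greedy)
    temperature=0.7,     # Sampling Temperature: ignored under greedy decoding
    num_beams=1,         # Number of Beams: > 1 enables beam search with greedy decoding
    max_new_tokens=200,  # Max Generated Tokens: cap on newly generated tokens
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```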
#### Watermark Parameters:

- **gamma** : The fraction of the vocabulary partitioned into the greenlist at each generation step. Smaller gamma values create a stronger watermark: because the watermarked model preferentially samples from a smaller green set, those tokens are less likely to occur by chance, enabling greater differentiation from human/unwatermarked text.
- **delta** : The amount of positive bias added to the logits of every greenlist token at each generation step before sampling/choosing the next token. Higher delta values mean that the greenlist tokens are more heavily preferred by the watermarked model, and as the bias becomes very large the watermark transitions from "soft" to "hard". A simplified sketch of this biasing step follows this list.
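
The biasing step amounts to a logits processor. Below is a simplified, hypothetical stand-in that seeds the greenlist directly from the previous token id; the demo's actual `WatermarkLogitsProcessor` uses a keyed hashing scheme described in the paper:

```python
import torch

def watermarked_logits(logits, prev_token_id, gamma=0.25, delta=2.0):
    """Add +delta to a pseudorandom gamma-fraction of the vocabulary."""
    vocab_size = logits.shape[-1]
    g = torch.Generator().manual_seed(prev_token_id)  # greenlist keyed off previous token
    perm = torch.randperm(vocab_size, generator=g)
    greenlist = perm[: int(gamma * vocab_size)]       # gamma-fraction becomes the greenlist
    biased = logits.clone()
    biased[greenlist] += delta                        # watermark goes soft -> hard as delta grows
    return biased

# Sample the next token from the biased distribution:
logits = torch.zeros(50257)  # dummy uniform logits over a GPT-2-sized vocabulary
probs = torch.softmax(watermarked_logits(logits, prev_token_id=42), dim=-1)
next_token = torch.multinomial(probs, num_samples=1)
```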

#### Detector Parameters:

- **z-score threshold** : the z-score cutoff for the hypothesis test. Higher thresholds (such as 4.0) make _false positives_ (predicting that human/unwatermarked text is watermarked) very unlikely, as genuine human text with a significant number of tokens will almost never achieve that high of a z-score. Lower thresholds will capture more _true positives_, as some watermarked [...] be flagged as "watermarked". However, a lower threshold will increase the chance that human text containing a slightly higher than average number of green tokens is erroneously flagged. 4.0-5.0 offers extremely low false positive rates while still accurately catching most watermarked text (see the first sketch after this list).
- **Ignore Bigram Repeats** : This alternate detection algorithm considers only the unique bigrams in the text during detection, computing the greenlists based on the first token in each pair and checking whether the second falls within that list. This means that `T` is now the number of unique bigrams in the text, which becomes less than the total number of tokens generated if the text contains a lot of repetition. See the paper for a more detailed discussion; a sketch of this variant appears after this list.
- **Normalizations** : We implement a few basic normalizations to defend against various adversarial perturbations of the text analyzed during detection. Currently we support converting all characters to unicode, replacing homoglyphs with a canonical form, and standardizing the capitalization. See the paper for a detailed discussion of input normalization; a sketch appears after this list.
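
The detection test can be sketched as a one-proportion z-test: count the green tokens among the `T` scored tokens and compare the z-score against the threshold. A minimal sketch, assuming the gamma used at generation time is known:

```python
import math

def z_score(green_count: int, T: int, gamma: float = 0.25) -> float:
    # Under the null hypothesis (no watermark), each token is green
    # with probability gamma, so green_count ~ Binomial(T, gamma).
    expected = gamma * T
    std = math.sqrt(T * gamma * (1 - gamma))
    return (green_count - expected) / std

# e.g. 90 green tokens out of T=200 at gamma=0.25:
print(z_score(90, 200))  # ~6.5, well above a 4.0 threshold -> flagged as watermarked
```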
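
For the bigram variant, deduplicating repeated bigrams before scoring can be sketched as follows; `is_green` is a hypothetical callback standing in for the detector's greenlist check:

```python
def score_unique_bigrams(token_ids, is_green):
    """is_green(prefix_token, token) -> bool, per the greenlist seeded by prefix_token."""
    bigrams = set(zip(token_ids[:-1], token_ids[1:]))  # keep each bigram once
    green = sum(is_green(a, b) for a, b in bigrams)
    return green, len(bigrams)                         # (green count, effective T)

# Toy usage: a highly repetitive text yields T=2 unique bigrams, not 5 total.
tokens = [5, 9, 5, 9, 5, 9]
green, T = score_unique_bigrams(tokens, lambda a, b: (a + b) % 4 == 2)
print(green, T)  # 2 2
```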
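
The normalizations might look roughly like the following; the two-entry homoglyph table is illustrative only (the demo ships full normalizer implementations):

```python
import unicodedata

HOMOGLYPHS = {"\u0430": "a", "\u0435": "e"}  # tiny example table: Cyrillic -> Latin

def normalize(text: str) -> str:
    text = unicodedata.normalize("NFKC", text)             # canonical unicode form
    text = "".join(HOMOGLYPHS.get(ch, ch) for ch in text)  # map homoglyphs to canonical chars
    return text.lower()                                    # standardize capitalization

print(normalize("W\u0430termark"))  # Cyrillic 'а' becomes Latin 'a' -> "watermark"
```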