Xenova (HF Staff) committed on

Commit 65841f9 · verified · Parent(s): fe55306

Add Transformers.js usage/sample code

Files changed (1): README.md (+184, -144)
---
license: apache-2.0
pipeline_tag: token-classification
library_name: transformers
tags:
- transformers.js
---

# OpenAI Privacy Filter

OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended for high-throughput data sanitization workflows where teams need a fast, context-aware, and tunable model that they can run on-premises.

OpenAI Privacy Filter is pretrained autoregressively to arrive at a checkpoint with an architecture similar to gpt-oss, albeit of a smaller size. We then converted that checkpoint into a bidirectional token classifier over a privacy label taxonomy and post-trained it with a supervised classification loss. (For architecture details about gpt-oss, please see the gpt-oss model card.) Instead of generating text token by token, this model labels an input sequence in a single forward pass, then decodes coherent spans with a constrained Viterbi procedure. For each input token, the model predicts a probability distribution over the label taxonomy, which consists of the 8 output categories described below.

Highlights:

- Permissive Apache 2.0 license: ideal for experimentation, customization, and commercial deployment.
- Small size: runs in a web browser or on a laptop, with 1.5B parameters total and 50M active parameters.
- Fine-tunable: adapt the model to specific data distributions through easy and data-efficient finetuning.
- Long-context: a 128,000-token context window enables processing long text with high throughput and no chunking.
- Runtime control: configure precision/recall tradeoffs and detected span lengths through preset operating points.

## Usage

### Transformers

1. Using the `pipeline` API:

```py
from transformers import pipeline

classifier = pipeline(
    task="token-classification",
    model="openai/privacy-filter",
)
classifier("My name is Alice Smith")
```

2. Using as an `AutoModelForTokenClassification` model:

```py
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai/privacy-filter")
model = AutoModelForTokenClassification.from_pretrained("openai/privacy-filter", device_map="auto")

inputs = tokenizer("My name is Alice Smith", return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs)

predicted_token_class_ids = outputs.logits.argmax(dim=-1)
predicted_token_classes = [model.config.id2label[token_id.item()] for token_id in predicted_token_class_ids[0]]
print(predicted_token_classes)
```
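
The per-token BIOES labels printed above can be merged into entity spans. Below is a minimal, dependency-free sketch of that aggregation; the helper name and toy inputs are illustrative, and production code should aggregate over character offsets (or use the pipeline's built-in aggregation) rather than raw token strings:

```python
def bioes_to_spans(tokens, labels):
    """Merge BIOES token labels into (category, [tokens]) spans.

    Illustrative helper only: unclosed or inconsistent spans are dropped.
    """
    spans, current = [], None
    for token, label in zip(tokens, labels):
        if label == "O":
            current = None
            continue
        tag, category = label.split("-", 1)
        if tag == "S":                      # single-token span
            spans.append((category, [token]))
            current = None
        elif tag == "B":                    # open a new span
            current = (category, [token])
        elif tag in ("I", "E") and current and current[0] == category:
            current[1].append(token)
            if tag == "E":                  # close the span
                spans.append(current)
                current = None
        else:                               # inconsistent sequence: reset
            current = None
    return spans

tokens = ["My", "name", "is", "Alice", "Smith"]
labels = ["O", "O", "O", "B-private_person", "E-private_person"]
print(bioes_to_spans(tokens, labels))  # [('private_person', ['Alice', 'Smith'])]
```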

### Transformers.js

1. Using the `pipeline` API:

```js
import { pipeline } from "@huggingface/transformers";

const classifier = await pipeline(
  "token-classification", "openai/privacy-filter",
  { device: "webgpu", dtype: "q4" },
);

const input = "My name is Harry Potter and my email is harry.potter@hogwarts.edu.";
const output = await classifier(input, { aggregation_strategy: "simple" });
console.dir(output, { depth: null });
```

<details>
<summary>See example output</summary>

```js
[
  {
    entity_group: 'private_person',
    score: 0.9999957978725433,
    word: ' Harry Potter'
  },
  {
    entity_group: 'private_email',
    score: 0.9999990728166368,
    word: ' harry.potter@hogwarts.edu'
  }
]
```
</details>
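
Aggregated entity spans like the example output above can then be used to mask the original text. A minimal Python sketch, assuming each entity carries `start`/`end` character offsets (the Python pipeline returns these; the offsets and placeholder format below are illustrative):

```python
def redact(text, entities):
    """Replace detected spans with [CATEGORY] placeholders.

    Works right to left so earlier character offsets stay valid.
    """
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group'].upper()}]"
        text = text[: ent["start"]] + placeholder + text[ent["end"] :]
    return text

text = "My name is Harry Potter and my email is harry.potter@hogwarts.edu."
entities = [
    {"entity_group": "private_person", "start": 11, "end": 23},
    {"entity_group": "private_email", "start": 40, "end": 65},
]
print(redact(text, entities))
# My name is [PRIVATE_PERSON] and my email is [PRIVATE_EMAIL].
```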

## Model Details

### Model Description

Privacy Filter is a bidirectional token classification model with span decoding. It is trained in phases, beginning with autoregressive pretraining. The pretrained language model is then modified and post-trained as a bidirectional banded-attention token classifier with band size 128 (effective attention window: 257 tokens including self). This means:

* The base model is an autoregressive pretrained checkpoint.
* The language-model output head is replaced with a token-classification head over privacy labels.
* Post-training is supervised token-level classification rather than next-token prediction.
* Inference applies constrained sequence decoding to produce coherent BIOES (Begin, Inside, Outside, End, Single) span labels.

Architecturally, the implementation in this repo is a pre-norm transformer encoder-style stack with:

* token embeddings
* 8 repeated transformer blocks
* grouped-query attention with rotary positional embeddings, with 14 query heads and 2 KV heads (group size = 7 queries per KV head)
* sparse mixture-of-experts feed-forward blocks with 128 experts total (top-4 routing per token)
* a final token-classification head over privacy labels (rather than natural-language vocabulary tokens), with residual stream width `d_model = 640`

Relative to iterative autoregressive approaches, this design allows all tokens to be labeled in one pass, which improves throughput. Relative to classical masked-language-model pretraining approaches, this is a post-training conversion of an autoregressive model rather than a native masked-LM setup.

### Output Shape

Privacy Filter can detect 8 privacy span categories:

1. `account_number`
2. `private_address`
3. `private_email`
4. `private_person`
5. `private_phone`
6. `private_url`
7. `private_date`
8. `secret`

To perform token classification, each non-background span category is expanded into boundary-tagged token classes: `B-<label>`, `I-<label>`, `E-<label>`, `S-<label>`, plus the background class, `O`. So the total number of token-level output classes is 33: 1 background class + 8 span labels × 4 boundary tags = 33 classes. This means the output head emits 33 logits for each token. For a sequence of length T, the output has shape `[T, 33]`; for a batch of size B, it has shape `[B, T, 33]`.

The token-label vocabulary consists of the background label `O` plus BIOES-tagged variants of each privacy category: `account_number`, `private_address`, `private_email`, `private_person`, `private_phone`, `private_url`, `private_date`, and `secret`. In other words, for each category, the model predicts `B-`, `I-`, `E-`, and `S-` forms corresponding to begin, inside, end, and single-token spans. At inference time, these per-token logits are decoded into coherent BIOES span labels using constrained sequence decoding.
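
As a sketch of the arithmetic above, the full token-label vocabulary can be enumerated in a few lines (the variable names are illustrative; the authoritative mapping is `id2label` in the model config):

```python
CATEGORIES = [
    "account_number", "private_address", "private_email", "private_person",
    "private_phone", "private_url", "private_date", "secret",
]

# Background class plus one B/I/E/S variant per category: 1 + 8 * 4 = 33.
LABELS = ["O"] + [f"{tag}-{cat}" for cat in CATEGORIES for tag in "BIES"]

print(len(LABELS))  # 33
```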
131
+
132
+ ### Sequence Decoding Rationale and Calibration
133
+
134
+ #### Rationale
135
+
136
+ After the token classifier produces per-token logits, we decode labels with a constrained Viterbi decoder using linear-chain transition scoring, rather than taking an independent argmax for each token. The decoder enforces allowed BIOES boundary transitions and scores complete label paths with start, transition, and end terms, plus six transition-bias parameters that control background persistence, span entry, span continuation, span closure, and boundary-to-boundary handoff. This global path optimization is intended to improve span coherence and boundary stability by making each token decision depend on sequence-level structure, not just local logits, especially in noisy or mixed-format text where local token decisions alone can produce fragmented or inconsistent boundaries.
137
+
138
+ #### Operating-Point Calibration
139
+
140
+ Sequence Decoding parameters can discourage staying in background while encouraging span entry and continuation, yielding broader and more contiguous masking for improved recall, or vice versa for improved precision. At runtime, users can tune parameters that control this tradeoff.
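
To make the decoding idea concrete, here is a toy constrained Viterbi decoder over a single-category BIOES tag set. This is an illustrative sketch only: the released model uses the full 33-class space and six transition-bias parameters, and none of the names below come from the repository. The `stay_bias` knob mimics the background-persistence parameter described above, and `enter_bias` the span-entry parameter; raising `stay_bias` trades recall for precision.

```python
STATES = ["O", "B", "I", "E", "S"]
ALLOWED = {  # legal BIOES transitions (previous tag -> next tag)
    "O": {"O", "B", "S"},
    "B": {"I", "E"},
    "I": {"I", "E"},
    "E": {"O", "B", "S"},
    "S": {"O", "B", "S"},
}
START = {"O", "B", "S"}  # tags a sequence may begin with
END = {"O", "E", "S"}    # tags a sequence may end with

def viterbi_decode(logprobs, enter_bias=0.0, stay_bias=0.0):
    """Constrained Viterbi over per-token log-probabilities.

    logprobs: list of {tag: logprob} dicts, one per token. Illustrative
    biases: enter_bias > 0 favors opening spans (recall), stay_bias > 0
    favors remaining in background (precision).
    """
    NEG = float("-inf")
    # score[s] = best log-score of a path ending in tag s
    score = {s: (logprobs[0][s] if s in START else NEG) for s in STATES}
    back = []
    for obs in logprobs[1:]:
        new, ptr = {}, {}
        for s in STATES:
            best_prev, best_val = None, NEG
            for p in STATES:
                if s not in ALLOWED[p]:
                    continue
                bias = 0.0
                if p == "O" and s in ("B", "S"):
                    bias = enter_bias  # entering a span from background
                elif p == "O" and s == "O":
                    bias = stay_bias   # persisting in background
                val = score[p] + bias
                if val > best_val:
                    best_prev, best_val = p, val
            new[s] = best_val + obs[s] if best_prev is not None else NEG
            ptr[s] = best_prev
        back.append(ptr)
        score = new
    # Backtrace from the best legal end tag.
    last = max((s for s in STATES if s in END), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Toy example: the middle token is locally ambiguous, but the transition
# constraints force a coherent B-I-E span rather than a fragmented one.
toy = [
    {"O": -2.0, "B": -0.2, "I": -3.0, "E": -3.0, "S": -2.0},
    {"O": -0.9, "B": -3.0, "I": -1.0, "E": -1.1, "S": -3.0},
    {"O": -2.0, "B": -3.0, "I": -3.0, "E": -0.3, "S": -2.5},
]
print(viterbi_decode(toy))  # ['B', 'I', 'E']
```

With a strong background-persistence bias, the same logits decode to all background, illustrating how the operating point shifts toward precision: `viterbi_decode(toy, stay_bias=5.0)` returns `['O', 'O', 'O']`.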

### Model Metadata

- Developed by: OpenAI
- Funded by: OpenAI
- Shared by: OpenAI
- Model type: Bidirectional token classification model for privacy span detection
- Language(s): Primarily English; selected multilingual robustness evaluations reported
- License: [Apache 2.0](LICENSE)

- Source repository: https://github.com/openai/privacy-filter
- Demo: https://huggingface.co/spaces/openai/privacy-filter
- Model card: [OpenAI Privacy Filter Model Card](https://cdn.openai.com/pdf/c66281ed-b638-456a-8ce1-97e9f5264a90/OpenAI-Privacy-Filter-Model-Card.pdf)

## Bias, Risks, and Limitations

### Risk: Over-reliance

Privacy Filter is a redaction and data minimization aid, not an anonymization, compliance, or safety guarantee. Over-relying on the tool as a blanket anonymization claim risks missing the desired privacy objectives. Privacy Filter is best used as one of multiple layers in a holistic, end-to-end privacy-by-design approach.

### Limitation: Static Label Policy

The model will only identify personal data spans that match the trained label taxonomy and definitions. Real-life privacy use cases are varied and complex, and definitions of appropriate label policies and decision boundaries can differ. Thus, model defaults may not satisfy organization-specific governance requirements without calibration or fine-tuning.

Privacy Filter does not support configuring label policies dynamically at runtime; changing policies instead requires further finetuning of the model. The native label set and associated decision boundaries may not be appropriate for every use case. For example, the model's training policy prioritizes personal identifiers, often preserving context that is not strongly person-linked by design; some users may want to adjust this choice.

Performance may drop on non-English text, non-Latin scripts, protected-group naming patterns, or domains that are out of distribution relative to model training.

### Failure Modes

Like all models, Privacy Filter can make mistakes, such as: under-detection of uncommon personal names, regional naming conventions, initials, honorific-heavy references, or domain-specific identifiers; over-redaction of public entities, organizations, locations, or common nouns when local context is ambiguous; fragmented or shifted span boundaries in mixed-format text, long documents, or text with heavy punctuation and layout artifacts; missed secrets for novel credential formats, project-specific token patterns, or secrets split across surrounding syntax; and over-redaction of benign high-entropy strings, placeholders, hashes, sample credentials, or synthetic examples that resemble secrets.

These limitations can interact with demographic, regional, and domain variation. For example, names and identifiers that are underrepresented in training data, or that follow conventions different from the dominant training distribution, may be more likely to be missed or inconsistently bounded.

### High-Risk Deployment Caution

Additional caution is warranted in high-sensitivity settings such as medical, legal, financial, human resources, education, and government workflows. In these settings, both false negatives and false positives can be costly: missed spans may expose sensitive information, while excess masking can remove material context needed for review, auditing, or downstream decision-making.

### Recommendations

- Use Privacy Filter as part of a holistic privacy-by-design approach, not as a blanket anonymization claim.
- Evaluate in-domain against local policy references before production.
- Use task-specific fine-tuning when your policy differs from the base decision boundaries.
- Keep human review paths for high-sensitivity workflows.