anismahmahi committed
Commit f228347
1 Parent(s): 459ebcb

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
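Only `pooling_mode_mean_tokens` is enabled, so sentence embeddings are the mean of the 768-dimensional token embeddings over non-padding positions. A minimal PyTorch sketch of that operation (the function and variable names are illustrative, not part of this repo):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len) of 0/1.
    mask = attention_mask.unsqueeze(-1).float()    # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)  # zero out padding, then sum over tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of real tokens per sentence
    return summed / counts                         # (batch, 768) sentence embeddings
```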
README.md ADDED
@@ -0,0 +1,275 @@
+ ---
+ library_name: setfit
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ metrics:
+ - accuracy
+ widget:
+ - text: Guy Cecil, the former head of the Democratic Senatorial Campaign Committee
+     and now the boss of a leading Democratic super PAC, voiced his frustration with
+     the inadequacy of Franken’s apology on Twitter.
+ - text: Attorney Stephen Le Brocq, who operates a law firm in the North Texas area
+     sums up the treatment of Guyger perfectly when he says that “The affidavit isn’t
+     written objectively, not at the slightest.
+ - text: Phone This field is for validation purposes and should be left unchanged.
+ - text: The Twitter suspension caught me by surprise.
+ - text: Popular pages like The AntiMedia (2.1 million fans), The Free Thought Project
+     (3.1 million fans), Press for Truth (350K fans), Police the Police (1.9 million
+     fans), Cop Block (1.7 million fans), and Punk Rock Libertarians (125K fans) are
+     just a few of the ones which were unpublished.
+ pipeline_tag: text-classification
+ inference: true
+ model-index:
+ - name: SetFit
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.9987117552334943
+       name: Accuracy
+ ---
+ 
+ # SetFit
+ 
+ This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. A OneVsRestClassifier instance serves as the classification head.
+ 
+ The model has been trained using an efficient few-shot learning technique that involves two stages, sketched in the example below:
+ 
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+ 
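+ The following is a minimal sketch of that two-stage procedure with SetFit 1.0. The backbone checkpoint and the toy dataset are illustrative assumptions; this card does not record which Sentence Transformer the model was fine-tuned from.
+ 
+ ```python
+ from datasets import Dataset
+ from setfit import SetFitModel, Trainer, TrainingArguments
+ 
+ # Toy few-shot dataset (illustrative only) using this model's integer labels 0-2.
+ train_dataset = Dataset.from_dict({
+     "text": ["An example of class 0.", "An example of class 1.", "An example of class 2."],
+     "label": [0, 1, 2],
+ })
+ 
+ # The backbone is an assumption; "one-vs-rest" yields the OneVsRestClassifier head this card describes.
+ model = SetFitModel.from_pretrained(
+     "sentence-transformers/paraphrase-mpnet-base-v2",
+     multi_target_strategy="one-vs-rest",
+ )
+ 
+ trainer = Trainer(
+     model=model,
+     args=TrainingArguments(batch_size=(16, 16), num_epochs=(2, 2)),
+     train_dataset=train_dataset,
+ )
+ trainer.train()  # stage 1: contrastive fine-tuning of the body; stage 2: fitting the head
+ ```
+ 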
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** SetFit
+ <!-- - **Sentence Transformer:** [Unknown](https://huggingface.co/unknown) -->
+ - **Classification head:** a OneVsRestClassifier instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 3 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+ 
+ ### Model Labels
+ | Label | Examples |
+ |:------|:---------|
+ | 2 | <ul><li>'This research group is only interested in violent extremism – according to their website.'</li><li>'No cop, anywhere, “signed up” to be murdered.'</li><li>"(Both those states are also part of today's federal lawsuit filed in the Western District of Washington.)"</li></ul> |
+ | 1 | <ul><li>'In the meantime, the New Mexico district attorney who failed to file for a preliminary hearing within 10 days and didn’t show up for court is vowing to pursue prosecution of these jihadis.'</li><li>'According to the Constitution, you, and you alone, are the sole head of the executive branch, and as such you are where the buck stop in making sure the laws are faithfully executed.'</li><li>'And the death of the three-year-old?'</li></ul> |
+ | 0 | <ul><li>'One of the Indonesian illegal aliens benefiting from her little amnesty took the hint and used the opportunity that Saris created to flee from arrest and deportation, absconding to a sanctuary church to hide from arrest.'</li><li>'So, why did Mueller focus on Manafort?'</li><li>'We had a lot of reporters in that room, many many reporters in that room and they were unable to ask questions because this guy gets up and starts, you know, doing what he’s supposed to be doing for him and for CNN and you know just shouting out questions and making statements, too."'</li></ul> |
+ 
+ ## Evaluation
+ 
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.9987 |
+ 
+ ## Uses
+ 
+ ### Direct Use for Inference
+ 
+ First install the SetFit library:
+ 
+ ```bash
+ pip install setfit
+ ```
+ 
+ Then you can load this model and run inference.
+ 
+ ```python
+ from setfit import SetFitModel
+ 
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("anismahmahi/doubt_repetition_with_noPropaganda_multiclass_SetFit")
+ # Run inference
+ preds = model("The Twitter suspension caught me by surprise.")
+ ```
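+ 
+ The returned labels are the integer classes 0-2 from the training data. As a sketch (assuming the SetFit 1.0 API), batched inputs and per-class probabilities from the one-vs-rest head look like this:
+ 
+ ```python
+ # Batched prediction: one integer label (0-2) per input text.
+ preds = model([
+     "The Twitter suspension caught me by surprise.",
+     "So, why did Mueller focus on Manafort?",
+ ])
+ # Per-class probabilities from the OneVsRestClassifier head.
+ probs = model.predict_proba(["The Twitter suspension caught me by surprise."])
+ ```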
+ 
+ <!--
+ ### Downstream Use
+ 
+ *List how someone could finetune this model on their own dataset.*
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 1 | 20.4272 | 109 |
+ 
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 0 | 131 |
+ | 1 | 129 |
+ | 2 | 2479 |
+ 
+ ### Training Hyperparameters
+ - batch_size: (16, 16)
+ - num_epochs: (2, 2)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 5
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
+ 
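+ As a sketch (assuming the SetFit 1.0 `TrainingArguments` API, whose field names these mirror), the configuration above maps onto code as follows. `distance_metric` and `margin` only affect triplet-style losses, so they are left at their defaults here:
+ 
+ ```python
+ from sentence_transformers.losses import CosineSimilarityLoss
+ from setfit import TrainingArguments
+ 
+ args = TrainingArguments(
+     batch_size=(16, 16),               # (embedding phase, classifier phase)
+     num_epochs=(2, 2),
+     max_steps=-1,                      # no step cap; run the full epochs
+     sampling_strategy="oversampling",  # oversample pairs to balance the labels
+     num_iterations=5,
+     body_learning_rate=(2e-05, 1e-05),
+     head_learning_rate=0.01,
+     loss=CosineSimilarityLoss,
+     end_to_end=False,
+     use_amp=False,
+     warmup_proportion=0.1,
+     seed=42,
+     eval_max_steps=-1,
+     load_best_model_at_end=True,
+ )
+ ```
+ 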
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:-------:|:--------:|:-------------:|:---------------:|
+ | 0.0006 | 1 | 0.3869 | - |
+ | 0.0292 | 50 | 0.3352 | - |
+ | 0.0584 | 100 | 0.2235 | - |
+ | 0.0876 | 150 | 0.1518 | - |
+ | 0.1168 | 200 | 0.1967 | - |
+ | 0.1460 | 250 | 0.1615 | - |
+ | 0.1752 | 300 | 0.1123 | - |
+ | 0.2044 | 350 | 0.1493 | - |
+ | 0.2336 | 400 | 0.0039 | - |
+ | 0.2629 | 450 | 0.0269 | - |
+ | 0.2921 | 500 | 0.0024 | - |
+ | 0.3213 | 550 | 0.0072 | - |
+ | 0.3505 | 600 | 0.0649 | - |
+ | 0.3797 | 650 | 0.0005 | - |
+ | 0.4089 | 700 | 0.0008 | - |
+ | 0.4381 | 750 | 0.0041 | - |
+ | 0.4673 | 800 | 0.0009 | - |
+ | 0.4965 | 850 | 0.0004 | - |
+ | 0.5257 | 900 | 0.0013 | - |
+ | 0.5549 | 950 | 0.0013 | - |
+ | 0.5841 | 1000 | 0.0066 | - |
+ | 0.6133 | 1050 | 0.0355 | - |
+ | 0.6425 | 1100 | 0.0004 | - |
+ | 0.6717 | 1150 | 0.0013 | - |
+ | 0.7009 | 1200 | 0.0003 | - |
+ | 0.7301 | 1250 | 0.0002 | - |
+ | 0.7593 | 1300 | 0.0008 | - |
+ | 0.7886 | 1350 | 0.0002 | - |
+ | 0.8178 | 1400 | 0.0002 | - |
+ | 0.8470 | 1450 | 0.0004 | - |
+ | 0.8762 | 1500 | 0.1193 | - |
+ | 0.9054 | 1550 | 0.0002 | - |
+ | 0.9346 | 1600 | 0.0002 | - |
+ | 0.9638 | 1650 | 0.0002 | - |
+ | 0.9930 | 1700 | 0.0002 | - |
+ | 1.0 | 1712 | - | 0.0073 |
+ | 1.0222 | 1750 | 0.0002 | - |
+ | 1.0514 | 1800 | 0.0006 | - |
+ | 1.0806 | 1850 | 0.0005 | - |
+ | 1.1098 | 1900 | 0.0001 | - |
+ | 1.1390 | 1950 | 0.0012 | - |
+ | 1.1682 | 2000 | 0.0003 | - |
+ | 1.1974 | 2050 | 0.0344 | - |
+ | 1.2266 | 2100 | 0.0038 | - |
+ | 1.2558 | 2150 | 0.0001 | - |
+ | 1.2850 | 2200 | 0.0003 | - |
+ | 1.3143 | 2250 | 0.0114 | - |
+ | 1.3435 | 2300 | 0.0001 | - |
+ | 1.3727 | 2350 | 0.0001 | - |
+ | 1.4019 | 2400 | 0.0001 | - |
+ | 1.4311 | 2450 | 0.0001 | - |
+ | 1.4603 | 2500 | 0.0005 | - |
+ | 1.4895 | 2550 | 0.0086 | - |
+ | 1.5187 | 2600 | 0.0001 | - |
+ | 1.5479 | 2650 | 0.0002 | - |
+ | 1.5771 | 2700 | 0.0001 | - |
+ | 1.6063 | 2750 | 0.0002 | - |
+ | 1.6355 | 2800 | 0.0001 | - |
+ | 1.6647 | 2850 | 0.0001 | - |
+ | 1.6939 | 2900 | 0.0001 | - |
+ | 1.7231 | 2950 | 0.0001 | - |
+ | 1.7523 | 3000 | 0.0001 | - |
+ | 1.7815 | 3050 | 0.0001 | - |
+ | 1.8107 | 3100 | 0.0 | - |
+ | 1.8400 | 3150 | 0.0001 | - |
+ | 1.8692 | 3200 | 0.0001 | - |
+ | 1.8984 | 3250 | 0.0001 | - |
+ | 1.9276 | 3300 | 0.0 | - |
+ | 1.9568 | 3350 | 0.0001 | - |
+ | 1.9860 | 3400 | 0.0002 | - |
+ | **2.0** | **3424** | **-** | **0.0053** |
+ 
+ * The bold row denotes the saved checkpoint.
+ 
+ ### Framework Versions
+ - Python: 3.10.12
+ - SetFit: 1.0.1
+ - Sentence Transformers: 2.2.2
+ - Transformers: 4.35.2
+ - PyTorch: 2.1.0+cu121
+ - Datasets: 2.16.1
+ - Tokenizers: 0.15.0
+ 
+ ## Citation
+ 
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "checkpoints/step_3424/",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.35.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.0.0",
+     "transformers": "4.7.0",
+     "pytorch": "1.9.0+cu102"
+   }
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "labels": null,
+   "normalize_embeddings": false
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d1c9c11cd13ee24b476e00ca8ffc9aee4e0e88d891e68ae234cbd587bb3fbec9
+ size 437967672
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fde9c0b712d7df8551290628727b190783dbdec9cdcff5d21ca057bacde74459
+ size 20436
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
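modules.json declares the Sentence Transformer pipeline: the Transformer module (the MPNet body at the repository root) feeds the Pooling module configured under `1_Pooling`. A sketch of the equivalent manual composition with sentence-transformers (the local path is a placeholder):

```python
from sentence_transformers import SentenceTransformer, models

# Placeholder path to a local clone of this repository.
body = models.Transformer("path/to/this/repo", max_seq_length=512)
pooling = models.Pooling(body.get_word_embedding_dimension(), pooling_mode="mean")
encoder = SentenceTransformer(modules=[body, pooling])

embeddings = encoder.encode(["The Twitter suspension caught me by surprise."])
```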
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,66 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render.