Corran committed on
Commit 3fa164d (1 parent: bdae235)

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
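With `pooling_mode_mean_tokens` set to `true`, the pooling layer averages the token embeddings of a sentence into a single 384-dimensional vector. The following is a minimal pure-Python sketch of masked mean pooling. The function name `mean_pool` and the toy 3-dimensional vectors are illustrative assumptions, not part of this repository; the real sentence-transformers module works on batched tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of non-padding tokens (mask value 1)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy example (hypothetical values): three real tokens and one padding token.
tokens = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)  # the padded row is ignored
```

The mask matters because padded positions still produce vectors; without it they would pull the average toward padding.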
README.md ADDED
@@ -0,0 +1,260 @@
+ ---
+ library_name: setfit
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ metrics:
+ - accuracy
+ widget:
+ - text: The aim of this study was to investigate the effect of diets of extreme macronutrient
+     composition on DIT under near physiological conditions in a respiration chamber
+     over the duration of a full day.
+ - text: It can be seen from the figure that the blue boundaries divide the spectrum
+     into too many areas.
+ - text: It may be the case that the seller commits to selling the product to the buyer
+     immediately after checking the order.
+ - text: These subjects were excluded from the study.
+ - text: While the chemical shift predictions that are used always have some level
+     of error, a key benefit of this approach is that individual errors of large magnitude
+     are easily identified and tolerated due to redundancy in the network of moving
+     peaks.
+ pipeline_tag: text-classification
+ inference: true
+ base_model: sentence-transformers/all-MiniLM-L6-v2
+ model-index:
+ - name: SetFit with sentence-transformers/all-MiniLM-L6-v2
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.9755555555555555
+       name: Accuracy
+ ---
+
+ # SetFit with sentence-transformers/all-MiniLM-L6-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 256 tokens
+ - **Number of Classes:** 9 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | 1 | <ul><li>'As the results indicate, significant differences were found between the experimental group and the control group concerning the characteristics of the exploration process.'</li><li>'No significant differences were found between fallers and non-fallers with respect to height, weight, or age.'</li><li>'There was a significant difference between the 5% calcium hypochlorite group and the other groups (P<0.001).'</li></ul> |
+ | 2 | <ul><li>'Our study was also limited by the lack of studies that reported age and gender-specific incidence for morbidity and mortality.'</li><li>'And while quiet stance was examined here, it is important to emphasize that the use of perturbations have provided great insight into those at risk of falling, and future prospective trials which incorporate more sophisticated assessment of fall risk are certain to provide critical information on the reactive mechanics of stability and the effects of age-related degradation on individual balance strategies [25, 26] .Another limitation of this study is the dependence of self-reporting of falls, the key parameter used to stratify the elderly groups into those with recent fall history or those with a limited history of falls.'</li><li>"Because a patient's immigration status is not recorded concomitantly with hospital resource use in any hospital, state, or federal database, it is not currently possible to isolate charity care and bad debt expenditures on An additional complicating factor is the possibility that, as a result of PRWORA, hospitals may provide and bill for services as emergency services that previously were categorized as nonemergency services in order to secure Medicaid payment."</li></ul> |
+ | 3 | <ul><li>'An 3-(4,5-dimethylthiazol-2yl)-2,5diphenyl tetrazolium bromide assay was used to evaluate the cytotoxicity of polyplexes at a series of N/P ratios in C6 and Hep G2 cells cultured in DMEM (with 10% fetal bovine serum) according to the methods described in our previous studies.'</li><li>'A multivariate analysis using logistic regression was used to evaluate the independent role of each covariate in hospital mortality.'</li><li>'Different methods have been used in the literature for implementing and updating the routing tables using the ant approach such as AntNet [1] .'</li></ul> |
+ | 4 | <ul><li>'The results of this study indicate that only the right GVS interfered with mental transformation.'</li><li>'The goal of this work is to explore the effects of general relativity on TDEs occurring in eccentric nuclear disks, and to quantify the distribution of orbital elements of TDEs that originate in eccentric nuclear disks.'</li><li>'Our results may have a number of important implications to the astrophysics of relativistic plasma in general and that of PWN in particular.'</li></ul> |
+ | 5 | <ul><li>'The gel retardation results of polymer/pDNA complexes with increasing N/P ratios are shown in Figure 1 .'</li><li>'In line with this, it has been suggested that the drift occurs only when the observed rubber hand is congruent in terms of posture and identity with the participants unseen hand (Tsakiris and Haggard, 2005) .'</li><li>'Mortality rates have been found to be high.'</li></ul> |
+ | 6 | <ul><li>'In order to use the information on prior falls in the prediction algorithm, elderly subjects were divided into two groups; those with a record of self-reported recent falls (n = 24; 14.9% of total elderly group) and those who had reported no falls in the prior sixmonth period (n = 137; 85.1% of total elderly group).'</li><li>"Semi-structured interviews were conducted with four 'custodians' (people working in locations where devices were deployed)."</li><li>'Patients who had previously undergone spinal surgery were excluded from the study.'</li></ul> |
+ | 7 | <ul><li>'Then, the cells were incubated for 4 h, and fresh media were added to the culture for another 20 h. Then, 10 μl of sterile, filtered 3-(4,5-dimethylthiazol-2yl)-2,5diphenyl tetrazolium bromide solution in phosphate-buffered saline (PBS) (5 mg ml −1 ) was added to each well.'</li><li>'One of the key problems in this area is the identification of influential users, by targeting whom certain desirable outcomes can be achieved.'</li><li>'The paper proceeds as follows.'</li></ul> |
+ | 8 | <ul><li>'The main aim of this paper is to present astrophysical parameters such as reddening, distance and age of Be 8 from four colour indices, (B − V ) , (V − I) , (R − I) and (G BP -G RP ) obtained from deep CCD U BV RI and Gaia photometries.'</li><li>'A key finding of the present study was that the rapid increase in GATA4 binding activity in cardiac nuclear extracts in response to pressure overload is mediated by ET-1 but not Ang II.'</li><li>'Section II of this paper provides an overview of the Bosch DCMG system and its components.'</li></ul> |
+ | 9 | <ul><li>'These results provide additional support for an activating role for H3K4me3 and a silencing role for H3K27me3 as leaves age.'</li><li>'Based on this result, it may be the case that the rate of apoptosis increases after day 5. in a previous study, mirnas were found to regulate cell proliferation, cell cycle progression and migration by altering the expressions of various factors, such as MalaT1 (48) .'</li><li>'It is therefore likely that the efforts put in by many groups to unravel the spatial regulation of the bAR system will be relevant for the understanding of human disease.'</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.9756 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("Corran/SciFunctions")
+ # Run inference
+ preds = model("These subjects were excluded from the study.")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 5 | 26.0891 | 245 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 1 | 450 |
+ | 2 | 450 |
+ | 3 | 450 |
+ | 4 | 450 |
+ | 5 | 450 |
+ | 6 | 450 |
+ | 7 | 450 |
+ | 8 | 450 |
+ | 9 | 450 |
+
+ ### Training Hyperparameters
+ - batch_size: (75, 75)
+ - num_epochs: (1, 1)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 20
+ - body_learning_rate: (2e-05, 2e-05)
+ - head_learning_rate: 2e-05
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0005 | 1 | 0.3763 | - |
+ | 0.0231 | 50 | 0.317 | - |
+ | 0.0463 | 100 | 0.2252 | - |
+ | 0.0694 | 150 | 0.189 | - |
+ | 0.0926 | 200 | 0.1505 | - |
+ | 0.1157 | 250 | 0.105 | - |
+ | 0.1389 | 300 | 0.1024 | - |
+ | 0.1620 | 350 | 0.0867 | - |
+ | 0.1852 | 400 | 0.0659 | - |
+ | 0.2083 | 450 | 0.0532 | - |
+ | 0.2315 | 500 | 0.0366 | - |
+ | 0.2546 | 550 | 0.0622 | - |
+ | 0.2778 | 600 | 0.0241 | - |
+ | 0.3009 | 650 | 0.0315 | - |
+ | 0.3241 | 700 | 0.025 | - |
+ | 0.3472 | 750 | 0.0412 | - |
+ | 0.3704 | 800 | 0.0274 | - |
+ | 0.3935 | 850 | 0.0203 | - |
+ | 0.4167 | 900 | 0.0302 | - |
+ | 0.4398 | 950 | 0.0152 | - |
+ | 0.4630 | 1000 | 0.0103 | - |
+ | 0.4861 | 1050 | 0.0102 | - |
+ | 0.5093 | 1100 | 0.0208 | - |
+ | 0.5324 | 1150 | 0.0168 | - |
+ | 0.5556 | 1200 | 0.0158 | - |
+ | 0.5787 | 1250 | 0.0045 | - |
+ | 0.6019 | 1300 | 0.014 | - |
+ | 0.625 | 1350 | 0.0061 | - |
+ | 0.6481 | 1400 | 0.0125 | - |
+ | 0.6713 | 1450 | 0.0048 | - |
+ | 0.6944 | 1500 | 0.0042 | - |
+ | 0.7176 | 1550 | 0.0055 | - |
+ | 0.7407 | 1600 | 0.0058 | - |
+ | 0.7639 | 1650 | 0.0032 | - |
+ | 0.7870 | 1700 | 0.0041 | - |
+ | 0.8102 | 1750 | 0.0042 | - |
+ | 0.8333 | 1800 | 0.0018 | - |
+ | 0.8565 | 1850 | 0.0094 | - |
+ | 0.8796 | 1900 | 0.0096 | - |
+ | 0.9028 | 1950 | 0.0043 | - |
+ | 0.9259 | 2000 | 0.003 | - |
+ | 0.9491 | 2050 | 0.0029 | - |
+ | 0.9722 | 2100 | 0.0016 | - |
+ | 0.9954 | 2150 | 0.0084 | - |
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - SetFit: 1.0.1
+ - Sentence Transformers: 2.2.2
+ - Transformers: 4.35.2
+ - PyTorch: 2.1.0+cu121
+ - Datasets: 2.16.1
+ - Tokenizers: 0.15.0
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+ doi = {10.48550/ARXIV.2209.11055},
+ url = {https://arxiv.org/abs/2209.11055},
+ author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+ keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+ title = {Efficient Few-Shot Learning Without Prompts},
+ publisher = {arXiv},
+ year = {2022},
+ copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
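The step and epoch columns in the README's Training Results table can be cross-checked against its hyperparameters and label counts. The sketch below redoes that arithmetic. It assumes that the oversampling strategy yields `num_iterations` positive and `num_iterations` negative pairs per training sentence; that pairing rule is an assumption about SetFit's sampler, not something this commit states, and the variable names are illustrative.

```python
import math

# Values copied from the README in this commit.
num_labels = 9
samples_per_label = 450
num_iterations = 20
batch_size = 75

num_samples = num_labels * samples_per_label         # 4050 training sentences
# Assumption: each sentence contributes num_iterations positive and
# num_iterations negative pairs.
num_pairs = num_samples * num_iterations * 2         # 162000 sentence pairs
steps_per_epoch = math.ceil(num_pairs / batch_size)  # 2160 optimizer steps


def epoch_at(step: int) -> float:
    """Fraction of the single training epoch completed at a given step."""
    return round(step / steps_per_epoch, 4)
```

Under that assumption, `epoch_at(50)` and `epoch_at(2150)` reproduce the 0.0231 and 0.9954 entries in the table, which suggests the embedding body was fine-tuned for roughly 2160 steps.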
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "/root/.cache/torch/sentence_transformers/sentence-transformers_all-MiniLM-L6-v2/",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.35.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.0.0",
+     "transformers": "4.6.1",
+     "pytorch": "1.8.1"
+   }
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "normalize_embeddings": false,
+   "labels": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6eb2755d97ebd3d807dbdc8159e65ddf64d1f1aad05088a3a0f3a77b94dc6565
+ size 90864192
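The `size` in this LFS pointer can be sanity-checked against the dimensions in `config.json`. The sketch below counts the parameters of a BERT encoder with those dimensions, assuming the usual layout: word, position and token-type embeddings with a LayerNorm, six encoder layers, and a pooler. Whether the file stores exactly these tensors is an assumption, and the helper `linear` is hypothetical.

```python
# Dimensions copied from config.json in this commit.
hidden = 384
intermediate = 1536
layers = 6
vocab = 30522
positions = 512
type_vocab = 2
layer_norm = 2 * hidden  # one weight and one bias vector


def linear(n_in: int, n_out: int) -> int:
    """Parameter count of a dense layer with bias."""
    return n_in * n_out + n_out


embeddings = (vocab + positions + type_vocab) * hidden + layer_norm
per_layer = (
    4 * linear(hidden, hidden)      # query, key, value, attention output
    + layer_norm                    # post-attention LayerNorm
    + linear(hidden, intermediate)  # feed-forward expansion
    + linear(intermediate, hidden)  # feed-forward projection
    + layer_norm                    # post-feed-forward LayerNorm
)
pooler = linear(hidden, hidden)

total_params = embeddings + layers * per_layer + pooler
float32_bytes = total_params * 4    # torch_dtype is float32
file_size = 90864192                # from the LFS pointer
overhead = file_size - float32_bytes
```

This gives about 22.7 million parameters, or roughly 90.85 MB of float32 weights. The remaining 11 KB or so would plausibly be the safetensors header plus any non-parameter tensors, though the commit does not confirm that breakdown.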
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:95749c9bde6ef957c53d04db9cae6d10e02dfeaa39bc14d9f8a99e8dc4eb0cda
+ size 28623
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
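The third module, `Normalize`, rescales each sentence embedding to unit L2 length, so the dot product of two embeddings equals their cosine similarity. Here is a small pure-Python sketch of that property; the helper names `l2_normalize` and `dot` and the 2-dimensional toy vectors are hypothetical, chosen only for illustration.

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy embeddings (hypothetical values).
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])
cosine = dot(a, b)  # equals the cosine similarity because both have unit length
```

Unit-length embeddings also put every feature on a comparable scale for a downstream classifier such as the logistic regression head.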
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 256,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,64 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
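The token IDs in `added_tokens_decoder`, together with `padding_side` and `truncation_side`, determine how a sentence is framed before it reaches the encoder. The sketch below assumes the usual BERT single-sentence layout, `[CLS] tokens [SEP]` followed by `[PAD]` on the right, with truncation on the right. The function `build_inputs` and the word-piece IDs in the example are hypothetical; a real tokenizer would produce the IDs from `vocab.txt`.

```python
from typing import Dict, List

# Special-token IDs copied from tokenizer_config.json in this commit.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102


def build_inputs(piece_ids: List[int], max_length: int) -> Dict[str, List[int]]:
    """Frame word-piece IDs as [CLS] ... [SEP], truncating and padding on the right."""
    if max_length < 2:
        raise ValueError("max_length must leave room for [CLS] and [SEP]")
    kept = piece_ids[: max_length - 2]          # right-side truncation
    input_ids = [CLS_ID] + kept + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    padding = max_length - len(input_ids)
    input_ids += [PAD_ID] * padding             # right-side padding
    attention_mask += [0] * padding
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Hypothetical word-piece IDs for a three-piece sentence.
encoded = build_inputs([2122, 5739, 2020], max_length=8)
```

The attention mask produced here is what lets a mean-pooling layer ignore the padded positions.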
vocab.txt ADDED
The diff for this file is too large to render. See raw diff