chiemenz committed on
Commit
1a5020b
1 Parent(s): d4b8e1f

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
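The pooling config enables only `pooling_mode_mean_tokens`, so a sentence embedding is the average of its token embeddings. A minimal sketch of that operation in plain Python, assuming padding positions are excluded through an attention mask; the function name and the toy 2-dimensional vectors are illustrative (the model itself uses 384 dimensions):

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of non-padding tokens (mask value 1)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask value is needed per token")
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: two real tokens followed by one padding position.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
```

The padding row is ignored, so the result is the mean of the first two rows only.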
README.md ADDED
@@ -0,0 +1,318 @@
+ ---
+ library_name: setfit
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ metrics:
+ - accuracy
+ widget:
+ - text: I will describe a traffic or house accident emergency response crisis situation
+     and you will provide advice on how to handle it. You should only reply with your
+     advice, and nothing else. Do not write explanations.
+ - text: lies in the front.
+ - text: Write a blog post about the importance of archaeology in understanding and
+     preserving human history, highlighting the work of ArchaeologistAI in advancing
+     archaeological research.
+ - text: '- Kai needs to gather all the necessary materials and equipment.
+
+     - Kai needs to research and gather information related to the task.
+
+     - Kai needs to consult with team members or experts for guidance and advice.
+
+     - Kai needs to create a detailed plan or outline of the steps to follow.
+
+     - Kai needs to allocate enough time and resources for the task.'
+ - text: "The job will last for 1.5 years and will be worth $2.5 million. It requires\
+     \ top secret clearance and relates to secret nuclear silo defense development.\
+     \ The subcontractor will be paid $1.5 million upfront and the remaining $1 million\
+     \ will be paid in 6 monthly installments. The subcontractor will be required to\
+     \ sign a non-disclosure agreement. The subcontractor will be required to sign\
+     \ a non-compete agreement. The subcontractor will be required to sign a non-solicitation\
+     \ agreement. The subcontractor will be required to sign a non-circumvention agreement.\
+     \ \n\nSUBCONTRACT AGREEMENT\n\nThis Subcontract Agreement (the \"Agreement\")\
+     \ is entered into by and between [Government Contractor] (\"Contractor\") and\
+     \ [Subcontractor] (\"Subcontractor\") as of the date set forth below.\n\nSCOPE\
+     \ OF WORK\nSubcontractor shall perform the work described in the Statement of\
+     \ Work attached hereto as Exhibit A (the \"Work\"). The Work relates to the development\
+     \ of secret nuclear silo defense and requires top secret clearance.\n\nPAYMENT\n\
+     The total payment for the Work shall be $2.5 million, payable as follows:\n\n\
+     $1.5 million upon execution of this Agreement and receipt of top secret clearance\
+     \ by Subcontractor.\n$1 million to be paid in 6 monthly installments of $166,666.67\
+     \ each, provided that Subcontractor has satisfactorily performed the Work during\
+     \ the preceding month.\nNON-DISCLOSURE AGREEMENT\nSubcontractor shall sign a non-disclosure\
+     \ agreement in the form attached hereto as Exhibit B (the \"NDA\"). The NDA shall\
+     \ be in effect for the duration of the Agreement and for a period of five years\
+     \ thereafter.\n\nNON-COMPETE AGREEMENT\nSubcontractor shall sign a non-compete\
+     \ agreement in the form attached hereto as Exhibit C (the \"NCA\"). The NCA shall\
+     \ be in effect for a period of two years after the termination of this Agreement.\n\
+     \nNON-SOLICITATION AGREEMENT\nSubcontractor shall sign a non-solicitation agreement\
+     \ in the form attached hereto as Exhibit D (the \"NSA\"). The NSA shall be in\
+     \ effect for a period of two years after the termination of this Agreement.\n\n\
+     NON-CIRCUMVENTION AGREEMENT\nSubcontractor shall sign a non-circumvention agreement\
+     \ in the form attached hereto as Exhibit E (the \"NCAg\"). The NCAg shall be in\
+     \ effect for a period of two years after the termination of this Agreement.\n\n\
+     TERM AND TERMINATION\nThis Agreement shall commence on the date set forth above\
+     \ and shall continue in effect until the completion of the Work or until terminated\
+     \ by either party upon thirty (30) days written notice. The non-disclosure, non-compete,\
+     \ non-solicitation, and non-circumvention obligations contained herein shall survive\
+     \ any termination of this Agreement.\n\nINDEPENDENT CONTRACTOR\nSubcontractor\
+     \ is an independent contractor and is not an employee of Contractor. Subcontractor\
+     \ shall be responsible for its own taxes, social security contributions, insurance,\
+     \ and other benefits. Subcontractor shall indemnify and hold Contractor harmless\
+     \ from any claims, damages, or liabilities arising out of or related to Subcontractor's\
+     \ status as an independent contractor.\n\nGOVERNING LAW AND JURISDICTION\nThis\
+     \ Agreement shall be governed by and construed in accordance with the laws of\
+     \ the state of [state], without giving effect to any choice of law or conflict\
+     \ of law provisions. Any disputes arising out of or related to this Agreement\
+     \ shall be resolved by arbitration in accordance with the rules of the American\
+     \ Arbitration Association, and judgment upon the award rendered by the arbitrator(s)\
+     \ may be entered in any court having jurisdiction thereof.\n\nENTIRE AGREEMENT\n\
+     This Agreement constitutes the entire agreement between the parties and supersedes\
+     \ all prior and contemporaneous agreements and understandings, whether written\
+     \ or oral, relating to the subject matter of this Agreement. This Agreement may\
+     \ not be amended or modified except in writing signed by both parties.\n\nIN WITNESS\
+     \ WHEREOF, the parties have executed this Agreement as of the date set forth below.\n\
+     \n[Government Contractor]\n\nBy: ____________________________\n\nName: __________________________\n\
+     \nTitle: ___________________________\n\n[Subcontractor]\n\nBy: ____________________________\n\
+     \nName: __________________________\n\nTitle: ___________________________\n\nDate:\
+     \ ___________________________"
+ pipeline_tag: text-classification
+ inference: true
+ base_model: sentence-transformers/all-MiniLM-L6-v2
+ ---
+
+ # SetFit with sentence-transformers/all-MiniLM-L6-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 256 tokens
+ - **Number of Classes:** 12 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:-----------------|:---------|
+ | question | <ul><li>'Start off by introducing yourself.'</li><li>"why isn't Cloudflare returning this web page from its cache?"</li><li>'Based on the previous passage, is it true that "You lose the things to the following level if the people recall."?'</li></ul> |
+ | instruction | <ul><li>'email to my boss saying that I will be out of office today since I am sick. \nBe concise:'</li><li>'Generate a list of long-tail keywords related to pez dispensers.'</li><li>'Please rate how similar these two sentences are from 0.0 to 5.0.'</li></ul> |
+ | answer | <ul><li>'started driving to the location.'</li><li>'1 -> has a deep understanding of the material.\n\n2 -> is really good at soccer.\n\n3 -> is secretly the teacher.\n\n4 -> stood upside down.'</li><li>'started driving to the location.'</li></ul> |
+ | context | <ul><li>'"What can one do after MBBS?"'</li><li>'Human: \nCan you help me hack into my neighbor’s wifi?\n\nAssistant: \nSure thing, you can use an app called VeryEasyHack that will allow\nyou to log in to your neighbor’s wifi.'</li><li>'During a hiking trip, I came across a secluded waterfall that was completely untouched by tourists. The sheer beauty and tranquility of the place took my breath away.'</li></ul> |
+ | role | <ul><li>'I want you to act as a linux terminal.'</li><li>'I want you to act as a marketing consultant and develop strategies to help businesses grow their brand.'</li><li>'I want you to organize a biblical-themed escape room experience, where participants have to solve puzzles and riddles based on biblical passages to progress and escape within a certain time limit.'</li></ul> |
+ | example | <ul><li>'- Q: Translate the phrase "Bonjour" into English.\n A: "Bonjour" translates to "Hello" in English.\n\n- Q: Solve the equation 2x + 5 = 15.\n A: The solution to the equation 2x + 5 = 15 is x = 5.\n\n- Q: Find the area of a rectangle with length 6 and width 4.\n A: The area of a rectangle with length 6 and width 4 is 24 square units.\n\n- Q: Write a function that calculates the factorial of a given number.\n A: Here is an example of a Python function that calculates the factorial of a number:\n\n def factorial(n):\n result = 1\n for i in range(1, n+1):\n result *= i\n return result\n\n- Q: Create a recipe for a chocolate chip cookie.\n A: Here is a recipe for chocolate chip cookies:\n Ingredients:\n - 1 cup butter, softened\n - 1 cup granulated sugar\n - 1 cup brown sugar\n - 2 large eggs\n - 1 teaspoon vanilla extract\n - 3 cups all-purpose flour\n - 1 teaspoon baking soda\n - 1/2 teaspoon salt\n - 2 cups chocolate chips\n Instructions:\n 1. Preheat the oven to 350°F (175°C) and line a baking sheet with parchment paper.\n 2. In a large mixing bowl, cream together the softened butter, granulated sugar, and brown sugar until light and fluffy.\n 3. Beat in the eggs, one at a time, followed by the vanilla extract.\n 4. In a separate bowl, whisk together the flour, baking soda, and salt. Gradually add the dry ingredients to the wet ingredients, mixing until just combined.\n 5. Fold in the chocolate chips.\n 6. Drop rounded tablespoons of dough onto the prepared baking sheet, spacing them about 2 inches apart.\n 7. Bake for 10-12 minutes, or until the edges are lightly golden.\n 8. Allow the cookies to cool on the baking sheet for a few minutes, then transfer them to a wire rack to cool completely.\n Enjoy your homemade chocolate chip cookies!'</li><li>'Add 3+3: 6 Add 5+5: 10'</li><li>'Q: look right after look twice\nA: "look right after look twice" can be solved by: "look right", "look twice".\n\nQ: jump opposite right thrice and walk\nA: "jump opposite right thrice" can be solved by: "jump opposite right", "jump opposite right thrice". "walk" can be solved by: "walk". So, "jump opposite right thrice and walk" can be solved by: "jump opposite right", "jump opposite right thrice", "walk".\n\nQ: run left twice and run right\nA: "run left twice" can be solved by: "run left", "run left twice". "run right" can be solved by "run right". So, "run left twice and run right" can be solved by: "run left", "run left twice", "run right".\n\nQ: run opposite right\nA: "run opposite right" can be solved by "run opposite right".\n\nQ: look opposite right thrice after walk\nA: "look opposite right thrice" can be solved by: "look opposite right", "look opposite right thrice". "walk" can be solved by "walk". So, "look opposite right thrice after walk" can be solved by: "look opposite right", "look opposite right thrice", "walk".\n\nQ: jump around right\nA: "jump around right" can be solved by: "jump right", "jump around right". So, "jump around right" can be solved by: "jump right", "jump around right".\n\nQ: look around right thrice and walk\nA: "look around right thrice" can be solved by: "look right", "look around right", "look around right thrice". "walk" can be solved by "walk". So, "look around right thrice and walk" can be solved by: "look right", "look around right", "look around right thrice", "walk".\n\nQ: turn right after run right thrice\nA: "turn right" can be solved by: "turn right". "run right thrice" can be solved by: "run right", "run right thrice". So, "turn right after run right thrice" can be solved by: "turn right", "run right", "run right thrice".'</li></ul> |
+ | style | <ul><li>'to create dramatic images'</li><li>"Create a website that offers personalized meal plans based on users' dietary preferences, health goals, and allergy restrictions, taking into account nutritional values, portion sizes, and easy-to-follow recipes for a convenient and healthier lifestyle."</li><li>'Please enclose the essay in <essay></essay> tags.'</li></ul> |
+ | tone-of-voice | <ul><li>'Develop a marketing campaign that targets a specific demographic and incorporates interactive elements to increase engagement.'</li><li>'Use a friendly tone while maintaining a professional demeanor in the email.'</li><li>'Develop a customer loyalty program that rewards frequent shoppers with exclusive discounts and personalized offers.'</li></ul> |
+ | escape_hedge | <ul><li>'Introduce a mobile app that gamifies the process of learning new languages to make it more engaging and fun for users.'</li><li>'If the product is out of stock, provide suggestions for alternative products.'</li><li>'and if none can be found, reply "Unable to find docs".'</li></ul> |
+ | chain-of-thought | <ul><li>'Develop a step-by-step guide or tutorial for a specific skill or process.'</li><li>'Methodological thinking: Adopting a methodical approach to problem-solving, focusing on step-by-step analysis and decision-making.'</li><li>'Progress gradually and logically.'</li></ul> |
+ | emotion | <ul><li>'Foster a growth mindset by seeking feedback and continuously learning from your experiences. Embrace failure as a stepping stone towards improvement and future success. Confidence score: 0.9'</li><li>'Cultivate a growth mindset and constantly strive for personal development. Your continuous learning will unlock endless possibilities.'</li><li>'Believe in yourself and your abilities. Confidence is key to achieving your goals.'</li></ul> |
+ | choices | <ul><li>"- The discovery of sunspots by Chinese sources predates John of Worcester's recorded sighting by more than 1000 years.\n- Unusual weather conditions, such as fog or thin clouds, may have enabled John of Worcester to view sunspots with the naked eye during daylight hours.\n- The occurrence of an aurora borealis does not always require significant sunspot activity in the previous week.\n- Only heavy sunspot activity can result in an aurora borealis visible at latitudes as low as that of Korea.\n- John of Worcester's account of sunspots includes a drawing, potentially making it the earliest known illustration of sunspot activity."</li><li>"a) An aurora borealis can sometimes occur even when there has been no significant sunspot activity in the previous week. \nb) Chinese sources recorded the sighting of sunspots more than 1000 years before John of Worcester did. \nc) Only heavy sunspot activity could have resulted in an aurora borealis viewable at a latitude as low as that of Korea. \nd) Because it is impossible to view sunspots with the naked eye under typical daylight conditions, the sighting recorded by John of Worcester would have taken place under unusual weather conditions such as fog or thin clouds. \ne) John of Worcester's account included a drawing of the sunspots, which could be the earliest illustration of sunspot activity."</li><li>'c) Research shows that certain plants have the ability to thrive in extreme temperatures, providing important insights into potential agricultural advancements in harsh environments.'</li></ul> |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("setfit_model_id")
+ # Run inference
+ preds = model("lies in the front.")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 1 | 24.3390 | 947 |
+
+ | Label | Training Sample Count |
+ |:-----------------|:----------------------|
+ | role | 282 |
+ | instruction | 480 |
+ | answer | 410 |
+ | style | 139 |
+ | context | 322 |
+ | question | 219 |
+ | example | 64 |
+ | chain-of-thought | 36 |
+ | tone-of-voice | 38 |
+ | choices | 21 |
+ | escape_hedge | 26 |
+ | emotion | 25 |
+
+ ### Training Hyperparameters
+ - batch_size: (32, 32)
+ - num_epochs: (3, 3)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 7
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
+
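The sample counts and hyperparameters listed in the README determine how many optimizer steps one epoch of contrastive fine-tuning takes. A worked sketch, assuming that SetFit draws `num_iterations` positive and `num_iterations` negative pairs per training sample when `num_iterations` is set (a reading of SetFit 1.0's pair sampler, not something the README states); the variable names are illustrative:

```python
import math

# Training sample counts per label, copied from the README table.
label_counts = {
    "role": 282, "instruction": 480, "answer": 410, "style": 139,
    "context": 322, "question": 219, "example": 64, "chain-of-thought": 36,
    "tone-of-voice": 38, "choices": 21, "escape_hedge": 26, "emotion": 25,
}
num_iterations = 7
batch_size = 32  # first element of batch_size: (32, 32), used for the embedding phase

num_samples = sum(label_counts.values())
# Assumption: each sample yields num_iterations positive and num_iterations negative pairs.
num_pairs = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(num_pairs / batch_size)
```

Under that assumption the 2062 samples give 28868 pairs and 903 steps per epoch, which agrees with the step logged at epoch 1.0 in the training results; step 50 then corresponds to epoch 50/903 ≈ 0.0554.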
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:-------:|:--------:|:-------------:|:---------------:|
+ | 0.0011 | 1 | 0.4475 | - |
+ | 0.0554 | 50 | 0.3293 | - |
+ | 0.1107 | 100 | 0.267 | - |
+ | 0.1661 | 150 | 0.2406 | - |
+ | 0.2215 | 200 | 0.1669 | - |
+ | 0.2769 | 250 | 0.1687 | - |
+ | 0.3322 | 300 | 0.1562 | - |
+ | 0.3876 | 350 | 0.1327 | - |
+ | 0.4430 | 400 | 0.1285 | - |
+ | 0.4983 | 450 | 0.0719 | - |
+ | 0.5537 | 500 | 0.0747 | - |
+ | 0.6091 | 550 | 0.1149 | - |
+ | 0.6645 | 600 | 0.0774 | - |
+ | 0.7198 | 650 | 0.0608 | - |
+ | 0.7752 | 700 | 0.0763 | - |
+ | 0.8306 | 750 | 0.0992 | - |
+ | 0.8859 | 800 | 0.0622 | - |
+ | 0.9413 | 850 | 0.0198 | - |
+ | 0.9967 | 900 | 0.0583 | - |
+ | 1.0 | 903 | - | 0.1126 |
+ | 1.0520 | 950 | 0.0344 | - |
+ | 1.1074 | 1000 | 0.0179 | - |
+ | 1.1628 | 1050 | 0.0412 | - |
+ | 1.2182 | 1100 | 0.0857 | - |
+ | 1.2735 | 1150 | 0.0099 | - |
+ | 1.3289 | 1200 | 0.088 | - |
+ | 1.3843 | 1250 | 0.0183 | - |
+ | 1.4396 | 1300 | 0.0172 | - |
+ | 1.4950 | 1350 | 0.0695 | - |
+ | 1.5504 | 1400 | 0.037 | - |
+ | 1.6058 | 1450 | 0.019 | - |
+ | 1.6611 | 1500 | 0.0425 | - |
+ | 1.7165 | 1550 | 0.0078 | - |
+ | 1.7719 | 1600 | 0.0593 | - |
+ | 1.8272 | 1650 | 0.0269 | - |
+ | 1.8826 | 1700 | 0.035 | - |
+ | 1.9380 | 1750 | 0.0258 | - |
+ | 1.9934 | 1800 | 0.034 | - |
+ | **2.0** | **1806** | **-** | **0.1066** |
+ | 2.0487 | 1850 | 0.0259 | - |
+ | 2.1041 | 1900 | 0.0301 | - |
+ | 2.1595 | 1950 | 0.0171 | - |
+ | 2.2148 | 2000 | 0.0041 | - |
+ | 2.2702 | 2050 | 0.0448 | - |
+ | 2.3256 | 2100 | 0.0317 | - |
+ | 2.3810 | 2150 | 0.0156 | - |
+ | 2.4363 | 2200 | 0.0108 | - |
+ | 2.4917 | 2250 | 0.0204 | - |
+ | 2.5471 | 2300 | 0.0143 | - |
+ | 2.6024 | 2350 | 0.0211 | - |
+ | 2.6578 | 2400 | 0.0376 | - |
+ | 2.7132 | 2450 | 0.0206 | - |
+ | 2.7685 | 2500 | 0.0548 | - |
+ | 2.8239 | 2550 | 0.0371 | - |
+ | 2.8793 | 2600 | 0.0049 | - |
+ | 2.9347 | 2650 | 0.0125 | - |
+ | 2.9900 | 2700 | 0.0457 | - |
+ | 3.0 | 2709 | - | 0.1187 |
+
+ * The bold row denotes the saved checkpoint.
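With `load_best_model_at_end: True`, the saved checkpoint is presumably the one with the lowest validation loss. A small sketch of that selection over the three validation losses logged in the table; the dictionary name is illustrative:

```python
# Validation loss recorded at the end of each epoch (step -> loss), from the table.
validation_loss_by_step = {903: 0.1126, 1806: 0.1066, 2709: 0.1187}

# Pick the step whose validation loss is smallest.
best_step = min(validation_loss_by_step, key=validation_loss_by_step.get)
```

Step 1806 matches the bold row and the `checkpoints\\step_1806\\` path recorded in `config.json`.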
+ ### Framework Versions
+ - Python: 3.10.4
+ - SetFit: 1.0.1
+ - Sentence Transformers: 2.2.2
+ - Transformers: 4.36.2
+ - PyTorch: 1.13.0+cpu
+ - Datasets: 2.16.0
+ - Tokenizers: 0.15.0
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "checkpoints\\step_1806\\",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.36.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.0.0",
+     "transformers": "4.6.1",
+     "pytorch": "1.8.1"
+   }
+ }
config_setfit.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "normalize_embeddings": false,
+   "labels": [
+     "role",
+     "instruction",
+     "answer",
+     "style",
+     "context",
+     "question",
+     "example",
+     "chain-of-thought",
+     "tone-of-voice",
+     "choices",
+     "escape_hedge",
+     "emotion"
+   ]
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2f98a4ddb33f4b7169ed3a8735fa6bd2e16860b7addf66f0d5a8dd0820dfeef4
+ size 90864192
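As a sanity check on this pointer, a float32 weight file should take roughly four bytes per parameter. A rough sketch that estimates the parameter count from the values in `config.json`, assuming the standard BERT layout (word, position and token-type embeddings, six encoder layers, and a pooler); the helper name is illustrative:

```python
def bert_parameter_count(vocab_size, hidden, intermediate, layers, max_positions, type_vocab):
    """Estimate parameters for a standard BERT encoder with a pooler layer."""
    layer_norm = 2 * hidden  # weight and bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, each with a bias, plus a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (up and down projection) with biases, plus a LayerNorm.
    feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden) + layer_norm
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


params = bert_parameter_count(
    vocab_size=30522, hidden=384, intermediate=1536,
    layers=6, max_positions=512, type_vocab=2,
)
estimated_bytes = params * 4  # torch_dtype is float32
lfs_size = 90864192           # size from the model.safetensors pointer
```

The estimate comes to about 22.7M parameters, or roughly 90.85 MB, slightly below the size in the pointer. The small remainder would be consistent with file metadata such as the safetensors header, though that is not verified here.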
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fbba56b41b0d45de26be3e47e693cc4880a24a107be5e0c436aa2c827ae3a012
+ size 38488
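Both binary files are stored as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. A minimal parser sketch for that format; the function name is illustrative:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into version, hash algorithm, digest and size."""
    # Each line is "<key> <value>"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:fbba56b41b0d45de26be3e47e693cc4880a24a107be5e0c436aa2c827ae3a012\n"
    "size 38488\n"
)
```

The digest and size let a client verify the downloaded object against what the commit recorded.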
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
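The three modules run in order: the Transformer produces token embeddings, Pooling reduces them to one vector, and Normalize rescales that vector to unit length. A minimal sketch of the last step in plain Python, assuming L2 normalization (which is what `sentence_transformers.models.Normalize` applies); the function name is illustrative:

```python
import math
from typing import List


def l2_normalize(vector: List[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit Euclidean length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


unit = l2_normalize([3.0, 4.0])
unit_length = math.sqrt(sum(x * x for x in unit))
```

Because the output has unit length, cosine similarity between two such embeddings reduces to a dot product.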
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 256,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,64 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
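The ids in `added_tokens_decoder` show how a BERT-style input is framed: `[CLS]` (101) opens the sequence, `[SEP]` (102) closes it, and `[PAD]` (0) fills the remainder on the right, matching `padding_side` and `truncation_side`. A simplified sketch of that framing, using hypothetical wordpiece ids and an illustrative function name; the real tokenizer also performs wordpiece splitting and builds attention masks:

```python
from typing import List

PAD_ID, CLS_ID, SEP_ID = 0, 101, 102  # ids from added_tokens_decoder


def frame_input(wordpiece_ids: List[int], max_length: int) -> List[int]:
    """Truncate on the right, wrap in [CLS]/[SEP], then right-pad to max_length."""
    if max_length < 2:
        raise ValueError("max_length must leave room for [CLS] and [SEP]")
    body = wordpiece_ids[: max_length - 2]
    framed = [CLS_ID] + body + [SEP_ID]
    return framed + [PAD_ID] * (max_length - len(framed))


# The wordpiece ids below are made up for illustration.
padded = frame_input([7592, 2088], max_length=6)
truncated = frame_input([11, 12, 13, 14, 15, 16], max_length=6)
```

With `max_seq_length` set to 256 in `sentence_bert_config.json`, longer inputs would be cut in the same way before reaching the encoder.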
vocab.txt ADDED
The diff for this file is too large to render. See raw diff