yaniseuranova committed on
Commit
9c3c65e
1 Parent(s): 423904b

Add SetFit model

Files changed (5)
  1. README.md +74 -37
  2. config.json +1 -1
  3. config_setfit.json +3 -1
  4. model.safetensors +1 -1
  5. model_head.pkl +2 -2
README.md CHANGED
@@ -10,14 +10,15 @@ tags:
10
  - text-classification
11
  - generated_from_setfit_trainer
12
  widget:
13
- - text: How does technology impact our daily lives and what benefits can it bring
14
- to various activities?
15
- - text: How do organizations effectively deploy and manage machine learning algorithms
16
- to drive business value?
17
- - text: What are the key considerations for organizing and managing computer lab resources
18
- and tracking their status?
19
- - text: How can batch processing improve the efficiency of data lake operations?
20
- - text: What is the purpose of setting up a CUPS on a server?
 
21
  inference: true
22
  model-index:
23
  - name: SetFit with sentence-transformers/all-MiniLM-L6-v2
@@ -31,7 +32,7 @@ model-index:
31
  split: test
32
  metrics:
33
  - type: accuracy
34
- value: 0.8947368421052632
35
  name: Accuracy
36
  ---
37
 
@@ -51,7 +52,7 @@ The model has been trained using an efficient few-shot learning technique that i
51
  - **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
52
  - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
53
  - **Maximum Sequence Length:** 256 tokens
54
- - **Number of Classes:** 2 classes
55
  <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
56
  <!-- - **Language:** Unknown -->
57
  <!-- - **License:** Unknown -->
@@ -63,17 +64,19 @@ The model has been trained using an efficient few-shot learning technique that i
63
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
64
 
65
  ### Model Labels
66
- | Label | Examples |
67
- |:---------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
68
- | lexical | <ul><li>"How does Happeo's search AI work to provide answers to user queries?"</li><li>'What are the primary areas of focus in the domain of Data Science and Analysis?'</li><li>'How can one organize a running event in Belgium?'</li></ul> |
69
- | semantic | <ul><li>'What changes can be made to a channel header?'</li><li>'How can hardware capabilities impact the accuracy of motion and object detections?'</li><li>'Who is responsible for managing guarantees and prolongations?'</li></ul> |
 
 
70
 
71
  ## Evaluation
72
 
73
  ### Metrics
74
  | Label | Accuracy |
75
  |:--------|:---------|
76
- | **all** | 0.8947 |
77
 
78
  ## Uses
79
 
@@ -93,7 +96,7 @@ from setfit import SetFitModel
93
  # Download from the 🤗 Hub
94
  model = SetFitModel.from_pretrained("yaniseuranova/setfit-rag-hybrid-search-query-router-test")
95
  # Run inference
96
- preds = model("What is the purpose of setting up a CUPS on a server?")
97
  ```
98
 
99
  <!--
@@ -125,16 +128,18 @@ preds = model("What is the purpose of setting up a CUPS on a server?")
125
  ### Training Set Metrics
126
  | Training set | Min | Median | Max |
127
  |:-------------|:----|:--------|:----|
128
- | Word count | 4 | 13.7407 | 28 |
129
 
130
- | Label | Training Sample Count |
131
- |:---------|:----------------------|
132
- | lexical | 44 |
133
- | semantic | 118 |
 
 
134
 
135
  ### Training Hyperparameters
136
- - batch_size: (32, 32)
137
- - num_epochs: (1, 1)
138
  - max_steps: -1
139
  - sampling_strategy: oversampling
140
  - body_learning_rate: (2e-05, 1e-05)
@@ -150,20 +155,52 @@ preds = model("What is the purpose of setting up a CUPS on a server?")
150
  - load_best_model_at_end: True
151
 
152
  ### Training Results
153
- | Epoch | Step | Training Loss | Validation Loss |
154
- |:-------:|:-------:|:-------------:|:---------------:|
155
- | 0.0020 | 1 | 0.4064 | - |
156
- | 0.0998 | 50 | 0.2177 | - |
157
- | 0.1996 | 100 | 0.0437 | - |
158
- | 0.2994 | 150 | 0.0057 | - |
159
- | 0.3992 | 200 | 0.0034 | - |
160
- | 0.4990 | 250 | 0.0009 | - |
161
- | 0.5988 | 300 | 0.0009 | - |
162
- | 0.6986 | 350 | 0.0007 | - |
163
- | 0.7984 | 400 | 0.0007 | - |
164
- | 0.8982 | 450 | 0.0009 | - |
165
- | 0.9980 | 500 | 0.0005 | - |
166
- | **1.0** | **501** | **-** | **0.1811** |
167
 
168
  * The bold row denotes the saved checkpoint.
169
  ### Framework Versions
 
10
  - text-classification
11
  - generated_from_setfit_trainer
12
  widget:
13
+ - text: What are the key components involved in developing a deep learning model for
14
+ handwritten digit recognition?
15
+ - text: What is the purpose of the message posted by the CR?
16
+ - text: How can researchers create and maintain public repositories for reproducible
17
+ research?
18
+ - text: What are the key components involved in developing a deep learning model for
19
+ handwritten digit recognition?
20
+ - text: How do you prioritize and delegate tasks to ensure efficient collaboration
21
+ and feedback?
22
  inference: true
23
  model-index:
24
  - name: SetFit with sentence-transformers/all-MiniLM-L6-v2
 
32
  split: test
33
  metrics:
34
  - type: accuracy
35
+ value: 0.5
36
  name: Accuracy
37
  ---
38
 
 
52
  - **Sentence Transformer body:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
53
  - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
54
  - **Maximum Sequence Length:** 256 tokens
55
+ - **Number of Classes:** 4 classes
56
  <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
57
  <!-- - **Language:** Unknown -->
58
  <!-- - **License:** Unknown -->
 
64
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
65
 
66
  ### Model Labels
67
+ | Label | Examples |
68
+ |:--------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
69
+ | lexical | <ul><li>'What are the key considerations when choosing an optimization method for a complex problem?'</li><li>'What are the challenges of being a remote mentor or sponsor?'</li><li>'How do researchers typically obtain information on the ranking of machine learning conferences?'</li></ul> |
70
+ | semantic | <ul><li>'What are common issues that users may encounter when accessing a platform that uses JumpCloud for authentication?'</li><li>'What are the key components involved in developing a deep learning model for handwritten digit recognition?'</li><li>'How can machine learning and data enrichment be used to improve business outcomes in various industries?'</li></ul> |
71
+ | very_semantic | <ul><li>"What are people's opinions on a particular topic?"</li><li>'What are the key considerations when proposing names for a project or initiative?'</li><li>'What are the key considerations for successful collaboration between industry and academia in research and development projects?'</li></ul> |
72
+ | very_lexical | <ul><li>'How can one track and store keys in a Flink operator?'</li><li>'What role do companies like Solvay play in addressing key societal challenges through their business strategies and operations?'</li><li>'What is the purpose of the scoring methodology in determining RAI maturity?'</li></ul> |
73
 
74
  ## Evaluation
75
 
76
  ### Metrics
77
  | Label | Accuracy |
78
  |:--------|:---------|
79
+ | **all** | 0.5 |
80
 
81
  ## Uses
82
 
 
96
  # Download from the 🤗 Hub
97
  model = SetFitModel.from_pretrained("yaniseuranova/setfit-rag-hybrid-search-query-router-test")
98
  # Run inference
99
+ preds = model("What is the purpose of the message posted by the CR?")
100
  ```
101
 
102
  <!--
 
128
  ### Training Set Metrics
129
  | Training set | Min | Median | Max |
130
  |:-------------|:----|:--------|:----|
131
+ | Word count | 8 | 14.4138 | 24 |
132
 
133
+ | Label | Training Sample Count |
134
+ |:--------------|:----------------------|
135
+ | lexical | 32 |
136
+ | semantic | 21 |
137
+ | very_lexical | 10 |
138
+ | very_semantic | 24 |
139
 
140
  ### Training Hyperparameters
141
+ - batch_size: (8, 8)
142
+ - num_epochs: (3, 3)
143
  - max_steps: -1
144
  - sampling_strategy: oversampling
145
  - body_learning_rate: (2e-05, 1e-05)
 
155
  - load_best_model_at_end: True
156
 
157
  ### Training Results
158
+ | Epoch | Step | Training Loss | Validation Loss |
159
+ |:-------:|:--------:|:-------------:|:---------------:|
160
+ | 0.0015 | 1 | 0.268 | - |
161
+ | 0.0736 | 50 | 0.2649 | - |
162
+ | 0.1473 | 100 | 0.3352 | - |
163
+ | 0.2209 | 150 | 0.2516 | - |
164
+ | 0.2946 | 200 | 0.2438 | - |
165
+ | 0.3682 | 250 | 0.1808 | - |
166
+ | 0.4418 | 300 | 0.2365 | - |
167
+ | 0.5155 | 350 | 0.1337 | - |
168
+ | 0.5891 | 400 | 0.2263 | - |
169
+ | 0.6627 | 450 | 0.1936 | - |
170
+ | 0.7364 | 500 | 0.0612 | - |
171
+ | 0.8100 | 550 | 0.1664 | - |
172
+ | 0.8837 | 600 | 0.0987 | - |
173
+ | 0.9573 | 650 | 0.0736 | - |
174
+ | 1.0 | 679 | - | 0.2288 |
175
+ | 1.0309 | 700 | 0.0568 | - |
176
+ | 1.1046 | 750 | 0.0765 | - |
177
+ | 1.1782 | 800 | 0.1193 | - |
178
+ | 1.2518 | 850 | 0.199 | - |
179
+ | 1.3255 | 900 | 0.2734 | - |
180
+ | 1.3991 | 950 | 0.194 | - |
181
+ | 1.4728 | 1000 | 0.1085 | - |
182
+ | 1.5464 | 1050 | 0.1496 | - |
183
+ | 1.6200 | 1100 | 0.1673 | - |
184
+ | 1.6937 | 1150 | 0.2225 | - |
185
+ | 1.7673 | 1200 | 0.0503 | - |
186
+ | 1.8409 | 1250 | 0.1531 | - |
187
+ | 1.9146 | 1300 | 0.2287 | - |
188
+ | 1.9882 | 1350 | 0.1187 | - |
189
+ | **2.0** | **1358** | **-** | **0.2055** |
190
+ | 2.0619 | 1400 | 0.0546 | - |
191
+ | 2.1355 | 1450 | 0.2072 | - |
192
+ | 2.2091 | 1500 | 0.1208 | - |
193
+ | 2.2828 | 1550 | 0.0837 | - |
194
+ | 2.3564 | 1600 | 0.0405 | - |
195
+ | 2.4300 | 1650 | 0.1334 | - |
196
+ | 2.5037 | 1700 | 0.1458 | - |
197
+ | 2.5773 | 1750 | 0.2189 | - |
198
+ | 2.6510 | 1800 | 0.0561 | - |
199
+ | 2.7246 | 1850 | 0.1656 | - |
200
+ | 2.7982 | 1900 | 0.1351 | - |
201
+ | 2.8719 | 1950 | 0.1826 | - |
202
+ | 2.9455 | 2000 | 0.1905 | - |
203
+ | 3.0 | 2037 | - | 0.2273 |
204
 
205
  * The bold row denotes the saved checkpoint.
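
The bold row in the new table is the epoch-end evaluation with the lowest validation loss, and its step number matches the `checkpoints/step_1358` path in `config.json`. The sketch below reproduces that selection under the assumption that the saved checkpoint was chosen by minimum validation loss (suggested by `load_best_model_at_end: True`, but not stated explicitly in the card). The helper `best_checkpoint` is a hypothetical name, not a SetFit function.

```python
# (step, validation loss) pairs copied from the epoch-end rows of the table above.
eval_rows = [(679, 0.2288), (1358, 0.2055), (2037, 0.2273)]


def best_checkpoint(rows):
    """Return the (step, loss) pair with the lowest validation loss.

    Assumption: lower validation loss is better and is the selection metric.
    """
    return min(rows, key=lambda row: row[1])


best_step, best_loss = best_checkpoint(eval_rows)
print(f"checkpoints/step_{best_step} (validation loss {best_loss})")
```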
206
  ### Framework Versions
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "checkpoints/step_501",
+ "_name_or_path": "checkpoints/step_1358",
  "architectures": [
  "BertModel"
  ],
config_setfit.json CHANGED
@@ -2,6 +2,8 @@
  "normalize_embeddings": false,
  "labels": [
  "lexical",
- "semantic"
+ "semantic",
+ "very_lexical",
+ "very_semantic"
  ]
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5b08cb8ef3f4175acd6951cd1ea664172bc14585810f21c96ddec7fe51c2a3b8
+ oid sha256:7fac62744a83855a95a3e80c70bf8a4648a3c5a1cd0053760fa1ff330790c771
  size 90864192
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9580a3d3e74febc8f840a57e62654f417272cf7d39a382095caa7babcb979f74
- size 3983
+ oid sha256:a5a2800b0ffabd217138abf7b9e4a3321ce002b79f4c83251f28a4f0a7a58788
+ size 13367