Commit 47db620 by yaniseuranova
1 Parent(s): 7f18162

Add SetFit model

Files changed (5)
  1. README.md +43 -40
  2. config.json +1 -1
  3. model.safetensors +1 -1
  4. model_head.pkl +1 -1
  5. sentencepiece.bpe.model +3 -0
README.md CHANGED
@@ -9,16 +9,16 @@ base_model: BAAI/bge-m3
  metrics:
  - accuracy
  widget:
- - text: What is the primary difference between a Bayesian neural network and a traditional
- feedforward neural network in the context of machine learning?
- - text: What is the difference betweensupervised and unsupervised machine learning
- algorithms in terms of data labeling and model training?
- - text: What is the primary application of Natural Language Processing (NLP) in Google's
- BERT language model, and how does it utilize masked language modeling to improve
- contextual understanding?
- - text: What is the main advantage of using GraphQL over traditional RESTful APIs,
- as demonstrated by social media giant Facebook in their Facebook ADS API?
- - text: Qui est Robin Mancini ?
  pipeline_tag: text-classification
  inference: true
  model-index:
@@ -65,10 +65,10 @@ The model has been trained using an efficient few-shot learning technique that i
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 
  ### Model Labels
- | Label | Examples |
- |:---------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
- | lexical | <ul><li>'What is the definition of semantics in the context ofontology-based data integration, and how does it differ from outright data normalization, as implementented in graph databases like neo4j orAmazon Neptune?'</li><li>'What is the primary application of graph convolutional neural networks (GCNNs) in natural language processing (NLP) for modeling syntactic dependencies in parsing?'</li><li>"What is the distinguising feature of Apache Hive's Metadata Tables, used for maintaining and managingtables in Hadoop Distributed File System (HDFS)?"</li></ul> |
- | semantic | <ul><li>'What is a key challenge faced by managers in sustaining a work culture that encourages creativity, innovation, and critical thinking within the technological industry globally?'</li><li>'How might shifting societal values influence the dynamics between multinational corporations and governments, leading to Changes in the global economic landscape?'</li><li>'How does the allocation of limited resources affect the allocation of decision-making power within an organization?'</li></ul> |
 
  ## Evaluation
 
@@ -95,7 +95,7 @@ from setfit import SetFitModel
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
  # Run inference
- preds = model("Qui est Robin Mancini ?")
  ```
 
  <!--
@@ -127,12 +127,12 @@ preds = model("Qui est Robin Mancini ?")
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:--------|:----|
- | Word count | 4 | 19.1392 | 56 |
 
  | Label | Training Sample Count |
  |:---------|:----------------------|
- | lexical | 36 |
- | semantic | 43 |
 
  ### Training Hyperparameters
  - batch_size: (16, 16)
@@ -154,27 +154,30 @@ preds = model("Qui est Robin Mancini ?")
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:-------:|:-------:|:-------------:|:---------------:|
- | 0.0050 | 1 | 0.1549 | - |
- | 0.2475 | 50 | 0.0045 | - |
- | 0.4950 | 100 | 0.0009 | - |
- | 0.7426 | 150 | 0.0005 | - |
- | 0.9901 | 200 | 0.0005 | - |
- | 1.0 | 202 | - | 0.0001 |
- | 1.2376 | 250 | 0.0006 | - |
- | 1.4851 | 300 | 0.0006 | - |
- | 1.7327 | 350 | 0.0005 | - |
- | 1.9802 | 400 | 0.0004 | - |
- | 2.0 | 404 | - | 0.0 |
- | 2.2277 | 450 | 0.0003 | - |
- | 2.4752 | 500 | 0.0003 | - |
- | 2.7228 | 550 | 0.0003 | - |
- | 2.9703 | 600 | 0.0003 | - |
- | **3.0** | **606** | **-** | **0.0** |
- | 3.2178 | 650 | 0.0003 | - |
- | 3.4653 | 700 | 0.0004 | - |
- | 3.7129 | 750 | 0.0003 | - |
- | 3.9604 | 800 | 0.0002 | - |
- | 4.0 | 808 | - | 0.0 |
 
  * The bold row denotes the saved checkpoint.
  ### Framework Versions
@@ -182,7 +185,7 @@ preds = model("Qui est Robin Mancini ?")
  - SetFit: 1.0.3
  - Sentence Transformers: 2.6.1
  - Transformers: 4.39.0
- - PyTorch: 2.3.0+cu121
  - Datasets: 2.18.0
  - Tokenizers: 0.15.2
 
 
  metrics:
  - accuracy
  widget:
+ - text: How doCompaniesbalanceIndividualCreativitywithTeamCollaboration to driveInnovationinthe
+ WORKPlace?
+ - text: How do the values of a learning organization impact its ability to innovate
+ and respond to constant change?
+ - text: What is the primary function of the Domain Name System (DNS) layer in the
+ Internet Protocol Stack, as defined by ICANN?
+ - text: What distinguishes a transforming industry from one that merely innovates
+ to existing practices?
+ - text: How can artificial intelligence systems balance individual autonomy with collective
+ responsibility in decision-making processes?
  pipeline_tag: text-classification
  inference: true
  model-index:
 
  - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
 
  ### Model Labels
+ | Label | Examples |
+ |:---------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | lexical | <ul><li>'What is the primary function of the Apache Kafka distributed streaming platform in Big Data processing?'</li><li>"What is the primary difference between Hadoop's FileSystem-based architecture and Apache Cassandra's distributed, masterlessArchitecture in scale-out design?"</li><li>'What is the main difference between optimistic concurrency control and pessimistic concurrency control in database management systems?'</li></ul> |
+ | semantic | <ul><li>"How does organizational morale impact the competitiveness of a company in today's fast-paced market?"</li><li>'How do organizations balance individual creativity with collective goal achievement in a dynamic environment?'</li><li>'What is a key challenge faced by managers in sustaining a work culture that encourages creativity, innovation, and critical thinking within the technological industry globally?'</li></ul> |
 
  ## Evaluation
 
 
  # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("yaniseuranova/setfit-paraphrase-mpnet-base-v2-sst2")
  # Run inference
+ preds = model("What distinguishes a transforming industry from one that merely innovates to existing practices?")
  ```
 
  <!--
 
  ### Training Set Metrics
  | Training set | Min | Median | Max |
  |:-------------|:----|:--------|:----|
+ | Word count | 4 | 19.1839 | 42 |
 
  | Label | Training Sample Count |
  |:---------|:----------------------|
+ | lexical | 43 |
+ | semantic | 44 |
 
  ### Training Hyperparameters
  - batch_size: (16, 16)
 
  ### Training Results
  | Epoch | Step | Training Loss | Validation Loss |
  |:-------:|:-------:|:-------------:|:---------------:|
+ | 0.0041 | 1 | 0.2391 | - |
+ | 0.2066 | 50 | 0.0033 | - |
+ | 0.4132 | 100 | 0.0007 | - |
+ | 0.6198 | 150 | 0.0007 | - |
+ | 0.8264 | 200 | 0.0007 | - |
+ | **1.0** | **242** | **-** | **0.0001** |
+ | 1.0331 | 250 | 0.0005 | - |
+ | 1.2397 | 300 | 0.0004 | - |
+ | 1.4463 | 350 | 0.0004 | - |
+ | 1.6529 | 400 | 0.0003 | - |
+ | 1.8595 | 450 | 0.0004 | - |
+ | 2.0 | 484 | - | 0.0001 |
+ | 2.0661 | 500 | 0.0003 | - |
+ | 2.2727 | 550 | 0.0003 | - |
+ | 2.4793 | 600 | 0.0002 | - |
+ | 2.6860 | 650 | 0.0003 | - |
+ | 2.8926 | 700 | 0.0002 | - |
+ | 3.0 | 726 | - | 0.0001 |
+ | 3.0992 | 750 | 0.0003 | - |
+ | 3.3058 | 800 | 0.0002 | - |
+ | 3.5124 | 850 | 0.0002 | - |
+ | 3.7190 | 900 | 0.0002 | - |
+ | 3.9256 | 950 | 0.0003 | - |
+ | 4.0 | 968 | - | 0.0001 |
 
  * The bold row denotes the saved checkpoint.
  ### Framework Versions
 
  - SetFit: 1.0.3
  - Sentence Transformers: 2.6.1
  - Transformers: 4.39.0
+ - PyTorch: 2.3.1+cu121
  - Datasets: 2.18.0
  - Tokenizers: 0.15.2
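
The Epoch column in the Training Results table appears to be the step number divided by the number of steps per epoch: the row with Epoch 1.0 sits at step 242 in the new run and at step 202 in the old one. A minimal Python sketch, assuming that relationship holds (the helper name `epoch_at_step` is hypothetical and not part of SetFit), reproduces a few of the logged values:

```python
def epoch_at_step(step: int, steps_per_epoch: int) -> str:
    """Format the fractional epoch for a step, to four decimals as in the table."""
    return f"{step / steps_per_epoch:.4f}"


# Steps per epoch, read off the rows where Epoch is exactly 1.0 (assumption).
NEW_RUN_STEPS_PER_EPOCH = 242
OLD_RUN_STEPS_PER_EPOCH = 202

for step in (1, 50, 250):
    print(step, epoch_at_step(step, NEW_RUN_STEPS_PER_EPOCH))
```

Under the same assumption, the bold rows correspond to the checkpoint names in config.json: step 606 is epoch 3.0 of the old run and step 242 is epoch 1.0 of the new run.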
 
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "checkpoints/step_606",
  "architectures": [
  "XLMRobertaModel"
  ],

  {
+ "_name_or_path": "checkpoints/step_242",
  "architectures": [
  "XLMRobertaModel"
  ],
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b6b631443242ee9cb7eaa44b335a5b8a0932d0f7730c1e523a2972f095dd5fe6
  size 2271064456

  version https://git-lfs.github.com/spec/v1
+ oid sha256:b1b888990ee5269d1f4c3795f8aeeb46da209d188e03543ea23b7fa884aaf2b5
  size 2271064456
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8f26505603a392dfebb8ca914a16aa7e94aeeb8b35f89376e80a56616a8b08a4
  size 9087

  version https://git-lfs.github.com/spec/v1
+ oid sha256:96d22b0c74de93b5a70d706bf42366826ce9a80c2d3a555a2fadaed9e3d0c5e3
  size 9087
sentencepiece.bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cfc8146abe2a0488e9e2a0c56de7952f7c11ab059eca145a0a727afce0db2865
+ size 5069051
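
The model.safetensors, model_head.pkl, and sentencepiece.bpe.model entries in this commit are Git LFS pointer files rather than the binaries themselves: each records a spec version, a SHA-256 object ID, and a size in bytes. A minimal Python sketch, assuming the three-line `key value` layout shown in this diff (the function names `parse_lfs_pointer` and `matches_pointer` are hypothetical), parses a pointer and checks a blob of bytes against it:

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer into its version, hash algorithm, oid and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    same_size = len(data) == pointer["size"]
    same_hash = hashlib.sha256(data).hexdigest() == pointer["oid"]
    return same_size and same_hash


# Pointer contents copied from the sentencepiece.bpe.model entry in this commit.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:cfc8146abe2a0488e9e2a0c56de7952f7c11ab059eca145a0a727afce0db2865
size 5069051
"""
pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

For multi-gigabyte files such as model.safetensors (2271064456 bytes), hashing the file in chunks would be preferable to loading it into memory at once.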