Daniel Korat committed on
Commit
cb65693
1 Parent(s): 32f70b7

Push model using huggingface_hub. (#1)

- Push model using huggingface_hub. (f08ffcebd2581d7269909fd4682c0df38dc950b5)

1_Pooling/config.json CHANGED
@@ -3,5 +3,7 @@
3
  "pooling_mode_cls_token": true,
4
  "pooling_mode_mean_tokens": false,
5
  "pooling_mode_max_tokens": false,
6
- "pooling_mode_mean_sqrt_len_tokens": false
 
 
7
  }
 
3
  "pooling_mode_cls_token": true,
4
  "pooling_mode_mean_tokens": false,
5
  "pooling_mode_max_tokens": false,
6
+ "pooling_mode_mean_sqrt_len_tokens": false,
7
+ "pooling_mode_weightedmean_tokens": false,
8
+ "pooling_mode_lasttoken": false
9
  }
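
As a reading aid for the hunk above: after this change every `pooling_mode_*` flag visible in the diff is false except `pooling_mode_cls_token`. The sketch below reproduces only the keys shown in the hunk (any other keys in the real file are not visible here) and uses a hypothetical helper, `active_pooling_modes`, to list the enabled modes.

```python
import json

# New-side pooling flags as shown in the diff hunk; keys outside the hunk
# (lines 1-2 of the real file) are not visible and are omitted here.
POOLING_CONFIG = json.loads("""
{
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false
}
""")


def active_pooling_modes(config):
    """Hypothetical helper: names of the pooling_mode_* flags set to true."""
    prefix = "pooling_mode_"
    return [
        key[len(prefix):]
        for key, enabled in config.items()
        if key.startswith(prefix) and enabled
    ]


print(active_pooling_modes(POOLING_CONFIG))
```

The two added keys do not change which mode is active; they only make the disabled modes explicit.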
README.md CHANGED
@@ -1,81 +1,206 @@
1
  ---
2
- license: apache-2.0
3
  tags:
4
  - setfit
5
  - sentence-transformers
6
  - text-classification
7
  pipeline_tag: text-classification
8
  ---
9
 
10
- # moshew/bge-small-en-v1.5_setfit-sst2-english
11
 
12
- This is a [SetFit model](https://github.com/huggingface/setfit) that can be used for text classification. The model has been trained using an efficient few-shot learning technique that involves:
13
 
14
- 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) ("BAAI/bge-small-en-v1.5") with contrastive learning.
 
 
15
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
16
 
17
- ## Training code
18
 
19
- ```python
20
- from setfit import SetFitModel
 
 
 
 
 
 
 
21
 
22
- from datasets import load_dataset
23
- from setfit import SetFitModel, SetFitTrainer
24
 
25
- # Load a dataset from the Hugging Face Hub
26
- dataset = load_dataset("SetFit/sst2")
 
27
 
28
- # Upload Train and Test data
29
- num_classes = 2
30
- test_ds = dataset["test"]
31
- train_ds = dataset["train"]
 
32
 
33
- model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5")
34
- trainer = SetFitTrainer(model=model, train_dataset=train_ds, eval_dataset=test_ds)
35
 
36
- # Train and evaluate
37
- trainer.train()
38
- trainer.evaluate()['accuracy']
 
39
 
40
- ```
41
 
42
- ## Usage
43
 
44
- To use this model for inference, first install the SetFit library:
45
 
46
  ```bash
47
- python -m pip install setfit
48
  ```
49
 
50
- You can then run inference as follows:
51
 
52
  ```python
53
  from setfit import SetFitModel
54
 
55
- # Download from Hub and run inference
56
- model = SetFitModel.from_pretrained("moshew/bge-small-en-v1.5_setfit-sst2-english")
57
  # Run inference
58
- preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
59
  ```
60
 
61
- ## Accuracy
62
- On SST-2 dev set:
63
-
64
- 91.4% SetFit
65
-
66
- 88.4% (no Fine-Tuning)
67
-
68
- ## BibTeX entry and citation info
69
-
70
  ```bibtex
71
  @article{https://doi.org/10.48550/arxiv.2209.11055,
72
- doi = {10.48550/ARXIV.2209.11055},
73
- url = {https://arxiv.org/abs/2209.11055},
74
- author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
75
- keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
76
- title = {Efficient Few-Shot Learning Without Prompts},
77
- publisher = {arXiv},
78
- year = {2022},
79
- copyright = {Creative Commons Attribution 4.0 International}
80
  }
81
  ```
1
  ---
2
+ library_name: setfit
3
  tags:
4
  - setfit
5
  - sentence-transformers
6
  - text-classification
7
+ - generated_from_setfit_trainer
8
+ datasets:
9
+ - SetFit/sst2
10
+ metrics:
11
+ - accuracy
12
+ widget:
13
+ - text: a noble failure .
14
+ - text: ms. seigner and mr. serrault bring fresh , unforced naturalism to their characters
15
+ .
16
+ - text: 'nothing can detract from the affection of that moral favorite : friends will
17
+ be friends through thick and thin .'
18
+ - text: confuses its message with an ultimate desire to please , and contorting itself
19
+ into an idea of expectation is the last thing any of these three actresses , nor
20
+ their characters , deserve .
21
+ - text: despite its promising cast of characters , big trouble remains a loosely tied
22
+ series of vignettes which only prove that ` zany ' does n't necessarily mean `
23
+ funny . '
24
  pipeline_tag: text-classification
25
+ inference: true
26
+ base_model: BAAI/bge-small-en-v1.5
27
+ model-index:
28
+ - name: SetFit with BAAI/bge-small-en-v1.5
29
+ results:
30
+ - task:
31
+ type: text-classification
32
+ name: Text Classification
33
+ dataset:
34
+ name: SetFit/sst2
35
+ type: SetFit/sst2
36
+ split: test
37
+ metrics:
38
+ - type: accuracy
39
+ value: 0.8841743119266054
40
+ name: Accuracy
41
  ---
42
 
43
+ # SetFit with BAAI/bge-small-en-v1.5
44
 
45
+ This is a [SetFit](https://github.com/huggingface/setfit) model trained on the [SetFit/sst2](https://huggingface.co/datasets/SetFit/sst2) dataset that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A [SetFitHead](huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance is used for classification.
46
 
47
+ The model has been trained using an efficient few-shot learning technique that involves:
48
+
49
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
50
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
51
 
52
+ ## Model Details
53
 
54
+ ### Model Description
55
+ - **Model Type:** SetFit
56
+ - **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
57
+ - **Classification head:** a [SetFitHead](huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance
58
+ - **Maximum Sequence Length:** 512 tokens
59
+ - **Number of Classes:** 2 classes
60
+ - **Training Dataset:** [SetFit/sst2](https://huggingface.co/datasets/SetFit/sst2)
61
+ <!-- - **Language:** Unknown -->
62
+ <!-- - **License:** Unknown -->
63
 
64
+ ### Model Sources
 
65
 
66
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
67
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
68
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
69
 
70
+ ### Model Labels
71
+ | Label | Examples |
72
+ |:------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
73
+ | 1 | <ul><li>'a stirring , funny and finally transporting re-imagining of beauty and the beast and 1930s horror films'</li><li>'this is a visually stunning rumination on love , memory , history and the war between art and commerce .'</li><li>"jonathan parker 's bartleby should have been the be-all-end-all of the modern-office anomie films ."</li></ul> |
74
+ | 0 | <ul><li>'apparently reassembled from the cutting-room floor of any given daytime soap .'</li><li>"they presume their audience wo n't sit still for a sociology lesson , however entertainingly presented , so they trot out the conventional science-fiction elements of bug-eyed monsters and futuristic women in skimpy clothes ."</li><li>'a fan film that for the uninitiated plays better on video with the sound turned down .'</li></ul> |
75
 
76
+ ## Evaluation
 
77
 
78
+ ### Metrics
79
+ | Label | Accuracy |
80
+ |:--------|:---------|
81
+ | **all** | 0.8842 |
82
 
83
+ ## Uses
84
 
85
+ ### Direct Use for Inference
86
 
87
+ First install the SetFit library:
88
 
89
  ```bash
90
+ pip install setfit
91
  ```
92
 
93
+ Then you can load this model and run inference.
94
 
95
  ```python
96
  from setfit import SetFitModel
97
 
98
+ # Download from the 🤗 Hub
99
+ model = SetFitModel.from_pretrained("dkorat/bge-small-en-v1.5_setfit-sst2-english")
100
  # Run inference
101
+ preds = model("a noble failure .")
102
  ```
103
 
104
+ <!--
105
+ ### Downstream Use
106
+
107
+ *List how someone could finetune this model on their own dataset.*
108
+ -->
109
+
110
+ <!--
111
+ ### Out-of-Scope Use
112
+
113
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
114
+ -->
115
+
116
+ <!--
117
+ ## Bias, Risks and Limitations
118
+
119
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
120
+ -->
121
+
122
+ <!--
123
+ ### Recommendations
124
+
125
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
126
+ -->
127
+
128
+ ## Training Details
129
+
130
+ ### Training Set Metrics
131
+ | Training set | Min | Median | Max |
132
+ |:-------------|:----|:-------|:----|
133
+ | Word count | 2 | 19.591 | 46 |
134
+
135
+ | Label | Training Sample Count |
136
+ |:------|:----------------------|
137
+ | 0 | 479 |
138
+ | 1 | 521 |
139
+
140
+ ### Training Hyperparameters
141
+ - batch_size: (16, 2)
142
+ - num_epochs: (1, 1)
143
+ - max_steps: -1
144
+ - sampling_strategy: oversampling
145
+ - num_iterations: 1
146
+ - body_learning_rate: (2e-05, 1e-05)
147
+ - head_learning_rate: 0.01
148
+ - loss: CosineSimilarityLoss
149
+ - distance_metric: cosine_distance
150
+ - margin: 0.25
151
+ - end_to_end: False
152
+ - use_amp: False
153
+ - warmup_proportion: 0.1
154
+ - seed: 42
155
+ - eval_max_steps: -1
156
+ - load_best_model_at_end: False
157
+
158
+ ### Training Results
159
+ | Epoch | Step | Training Loss | Validation Loss |
160
+ |:-----:|:----:|:-------------:|:---------------:|
161
+ | 0.008 | 1 | 0.241 | - |
162
+ | 0.4 | 50 | 0.2525 | - |
163
+ | 0.8 | 100 | 0.0607 | - |
164
+
165
+ ### Framework Versions
166
+ - Python: 3.10.13
167
+ - SetFit: 1.0.3
168
+ - Sentence Transformers: 2.3.0
169
+ - Transformers: 4.37.2
170
+ - PyTorch: 2.1.2+cu121
171
+ - Datasets: 2.16.1
172
+ - Tokenizers: 0.15.1
173
+
174
+ ## Citation
175
+
176
+ ### BibTeX
177
  ```bibtex
178
  @article{https://doi.org/10.48550/arxiv.2209.11055,
179
+ doi = {10.48550/ARXIV.2209.11055},
180
+ url = {https://arxiv.org/abs/2209.11055},
181
+ author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
182
+ keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
183
+ title = {Efficient Few-Shot Learning Without Prompts},
184
+ publisher = {arXiv},
185
+ year = {2022},
186
+ copyright = {Creative Commons Attribution 4.0 International}
187
  }
188
  ```
189
+
190
+ <!--
191
+ ## Glossary
192
+
193
+ *Clearly define terms in order to be accessible across audiences.*
194
+ -->
195
+
196
+ <!--
197
+ ## Model Card Authors
198
+
199
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
200
+ -->
201
+
202
+ <!--
203
+ ## Model Card Contact
204
+
205
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
206
+ -->
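
The numbers in the new model card can be cross-checked with a little arithmetic. The Training Results table logs epoch 0.008 at step 1, 0.4 at step 50, and 0.8 at step 100, which all imply the same number of steps per epoch. The sketch below also checks that figure against the sample counts and hyperparameters, under the assumption (not stated in the card) that each training sample contributes one positive and one negative pair per iteration.

```python
# Values copied from the model card tables in the diff.
label_counts = {0: 479, 1: 521}
embedding_batch_size = 16  # first element of batch_size: (16, 2)
num_iterations = 1
logged_rows = [(0.008, 1), (0.4, 50), (0.8, 100)]  # (epoch, step)

# Steps per epoch implied by each logged row: step / epoch.
implied_steps_per_epoch = [round(step / epoch) for epoch, step in logged_rows]

# Assumption: one positive and one negative pair per sample per iteration.
num_samples = sum(label_counts.values())
num_pairs = 2 * num_iterations * num_samples
expected_steps_per_epoch = num_pairs // embedding_batch_size

print(implied_steps_per_epoch, num_samples, expected_steps_per_epoch)
```

If the pairing assumption holds, the logged epochs and the 1,000-sample training set are consistent with each other.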
config.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "_name_or_path": "/root/.cache/torch/sentence_transformers/BAAI_bge-small-en-v1.5/",
3
  "architectures": [
4
  "BertModel"
5
  ],
@@ -24,7 +24,7 @@
24
  "pad_token_id": 0,
25
  "position_embedding_type": "absolute",
26
  "torch_dtype": "float32",
27
- "transformers_version": "4.34.1",
28
  "type_vocab_size": 2,
29
  "use_cache": true,
30
  "vocab_size": 30522
 
1
  {
2
+ "_name_or_path": "BAAI/bge-small-en-v1.5",
3
  "architectures": [
4
  "BertModel"
5
  ],
 
24
  "pad_token_id": 0,
25
  "position_embedding_type": "absolute",
26
  "torch_dtype": "float32",
27
+ "transformers_version": "4.37.2",
28
  "type_vocab_size": 2,
29
  "use_cache": true,
30
  "vocab_size": 30522
config_setfit.json ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ {
2
+ "normalize_embeddings": false,
3
+ "labels": null
4
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d0b24ae91024bae35a46a0ebcab9978cc3ef409d4d2ccf54acbfa7c666f76603
3
+ size 133462128
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:7b1e14c8a669c5bb0991af076e79447939612d94ff7300208f3278d656fda0fc
3
- size 3919
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4068de7435b164f0b4153e1792c925ff7105d0ca1747c5679d2f3116173be49d
3
+ size 4585
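
The `model_head.pkl` and `model.safetensors` entries are Git LFS pointer files, so the diff shows only a version line, a SHA-256 object ID, and a byte size. A minimal sketch of reading those fields follows; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any library.

```python
def parse_lfs_pointer(text):
    """Hypothetical helper: split a Git LFS pointer into its three fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Old and new pointers for model_head.pkl, copied from the diff.
OLD_HEAD = """version https://git-lfs.github.com/spec/v1
oid sha256:7b1e14c8a669c5bb0991af076e79447939612d94ff7300208f3278d656fda0fc
size 3919
"""
NEW_HEAD = """version https://git-lfs.github.com/spec/v1
oid sha256:4068de7435b164f0b4153e1792c925ff7105d0ca1747c5679d2f3116173be49d
size 4585
"""

old_head = parse_lfs_pointer(OLD_HEAD)
new_head = parse_lfs_pointer(NEW_HEAD)
size_delta = new_head["size"] - old_head["size"]
print(new_head["algorithm"], size_delta)
```

The changed object ID confirms that the serialized head was replaced, and the size difference shows the new file is 666 bytes larger.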
special_tokens_map.json CHANGED
@@ -1,7 +1,37 @@
1
  {
2
- "cls_token": "[CLS]",
3
- "mask_token": "[MASK]",
4
- "pad_token": "[PAD]",
5
- "sep_token": "[SEP]",
6
- "unk_token": "[UNK]"
7
  }
 
1
  {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
  }
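
The `special_tokens_map.json` change replaces plain string values with objects that carry the same token text under `content`, plus explicit `lstrip`, `normalized`, `rstrip`, and `single_word` flags. The sketch below uses a hypothetical helper, `token_content`, to show that both serializations name the same tokens. It is an illustration of the two shapes, not code from any tokenizer library.

```python
TOKEN_FLAGS = {
    "lstrip": False,
    "normalized": False,
    "rstrip": False,
    "single_word": False,
}

# Old side of the diff: plain strings.
OLD_MAP = {
    "cls_token": "[CLS]",
    "mask_token": "[MASK]",
    "pad_token": "[PAD]",
    "sep_token": "[SEP]",
    "unk_token": "[UNK]",
}

# New side of the diff: one object per token with identical flags.
NEW_MAP = {
    name: {"content": text, **TOKEN_FLAGS} for name, text in OLD_MAP.items()
}


def token_content(entry):
    """Hypothetical helper: token text from either serialization."""
    return entry["content"] if isinstance(entry, dict) else entry


old_tokens = {name: token_content(entry) for name, entry in OLD_MAP.items()}
new_tokens = {name: token_content(entry) for name, entry in NEW_MAP.items()}
print(old_tokens == new_tokens)
```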
tokenizer_config.json CHANGED
@@ -46,7 +46,7 @@
46
  "do_basic_tokenize": true,
47
  "do_lower_case": true,
48
  "mask_token": "[MASK]",
49
- "model_max_length": 1000000000000000019884624838656,
50
  "never_split": null,
51
  "pad_token": "[PAD]",
52
  "sep_token": "[SEP]",
 
46
  "do_basic_tokenize": true,
47
  "do_lower_case": true,
48
  "mask_token": "[MASK]",
49
+ "model_max_length": 512,
50
  "never_split": null,
51
  "pad_token": "[PAD]",
52
  "sep_token": "[SEP]",
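
The old `model_max_length` value is not arbitrary: it is the exact integer obtained by converting the float `1e30`, which is typically what appears when no real limit has been recorded. The new value, 512, matches the "Maximum Sequence Length" stated in the model card. A small sketch follows; `effective_max_length` and its threshold are hypothetical, shown only to illustrate treating the huge value as "unset".

```python
OLD_MODEL_MAX_LENGTH = 1000000000000000019884624838656
NEW_MODEL_MAX_LENGTH = 512

# The old value equals the float 1e30 converted to an integer.
is_float_sentinel = OLD_MODEL_MAX_LENGTH == int(1e30)


def effective_max_length(model_max_length, fallback=512, threshold=10**12):
    """Hypothetical helper: treat implausibly large limits as 'unset'.

    The fallback and threshold are illustrative assumptions.
    """
    return fallback if model_max_length > threshold else model_max_length


print(is_float_sentinel)
print(effective_max_length(OLD_MODEL_MAX_LENGTH))
print(effective_max_length(NEW_MODEL_MAX_LENGTH))
```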