tomaarsen (HF staff) committed
Commit bfba1ee
1 Parent(s): 048219b

Add SetFit ABSA model

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
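This pooling config enables masked mean pooling over the 768-dimensional token embeddings. A minimal sketch of what that computes (a generic implementation for illustration, not code from this repository):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Mask out padding tokens, then average the remaining token embeddings.
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(dim=1)                   # (batch, 768)
    counts = mask.sum(dim=1).clamp(min=1e-9)                        # real tokens per sequence
    return summed / counts                                          # sentence embeddings
```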
README.md ADDED
@@ -0,0 +1,260 @@
+ ---
+ library_name: setfit
+ tags:
+ - setfit
+ - absa
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ metrics:
+ - accuracy
+ widget:
+ - text: bar:After really enjoying ourselves at the bar we sat down at a table and
+     had dinner.
+ - text: interior decor:this little place has a cute interior decor and affordable
+     city prices.
+ - text: cuisine:The cuisine from what I've gathered is authentic Taiwanese, though
+     its very different from what I've been accustomed to in Taipei.
+ - text: dining:Go here for a romantic dinner but not for an all out wow dining experience.
+ - text: Taipei:The cuisine from what I've gathered is authentic Taiwanese, though
+     its very different from what I've been accustomed to in Taipei.
+ pipeline_tag: text-classification
+ inference: false
+ co2_eq_emissions:
+   emissions: 8.62132655272333
+   source: codecarbon
+   training_type: fine-tuning
+   on_cloud: false
+   cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
+   ram_total_size: 31.777088165283203
+   hours_used: 0.111
+   hardware_used: 1 x NVIDIA GeForce RTX 3090
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
+ model-index:
+ - name: SetFit Aspect Model with sentence-transformers/paraphrase-mpnet-base-v2
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.8779507785032646
+       name: Accuracy
+ ---
+ 
+ # SetFit Aspect Model with sentence-transformers/paraphrase-mpnet-base-v2
+ 
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Aspect Based Sentiment Analysis (ABSA). This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification. In particular, this model is in charge of filtering aspect span candidates.
+ 
+ The model has been trained using an efficient few-shot learning technique that involves:
+ 
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+ 
+ This model was trained within the context of a larger system for ABSA, which looks like this:
+ 
+ 1. Use a spaCy model to select possible aspect span candidates (see the sketch below).
+ 2. **Use this SetFit model to filter these possible aspect span candidates.**
+ 3. Use a SetFit model to classify the filtered aspect span candidates.
+ 
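+ As an illustration of step 1, candidate spans can be drawn from a spaCy pipeline. This is a minimal sketch that assumes noun chunks approximate the candidate spans; it is not necessarily the exact extraction logic SetFit uses:
+ 
+ ```python
+ import spacy
+ 
+ # Assumption: noun chunks are a reasonable stand-in for aspect span candidates.
+ nlp = spacy.load("en_core_web_lg")  # the spaCy model named in this card
+ doc = nlp("The food was great, but the venue is just way too busy.")
+ candidates = [chunk.text for chunk in doc.noun_chunks]
+ print(candidates)  # e.g. ['The food', 'the venue']
+ ```
+ 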
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **spaCy Model:** en_core_web_lg
+ - **SetFitABSA Aspect Model:** [tomaarsen/setfit-absa-paraphrase-mpnet-base-v2-restaurants-aspect](https://huggingface.co/tomaarsen/setfit-absa-paraphrase-mpnet-base-v2-restaurants-aspect)
+ - **SetFitABSA Polarity Model:** [tomaarsen/setfit-absa-paraphrase-mpnet-base-v2-restaurants-polarity](https://huggingface.co/tomaarsen/setfit-absa-paraphrase-mpnet-base-v2-restaurants-polarity)
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+ 
+ ### Model Labels
+ | Label     | Examples |
+ |:----------|:---------|
+ | aspect    | <ul><li>'staff:But the staff was so horrible to us.'</li><li>"food:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."</li><li>"food:The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not."</li></ul> |
+ | no aspect | <ul><li>"factor:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."</li><li>"deficiencies:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."</li><li>"Teodora:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."</li></ul> |
+ 
+ ## Evaluation
+ 
+ ### Metrics
+ | Label   | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.8780   |
+ 
+ ## Uses
+ 
+ ### Direct Use for Inference
+ 
+ First install the SetFit library:
+ 
+ ```bash
+ pip install setfit
+ ```
+ 
+ Then you can load this model and run inference.
+ 
+ ```python
+ from setfit import AbsaModel
+ 
+ # Download from the 🤗 Hub
+ model = AbsaModel.from_pretrained(
+     "tomaarsen/setfit-absa-paraphrase-mpnet-base-v2-restaurants-aspect",
+     "tomaarsen/setfit-absa-paraphrase-mpnet-base-v2-restaurants-polarity",
+ )
+ # Run inference
+ preds = model("The food was great, but the venue is just way too busy.")
+ ```
+ 
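+ The result is one list of aspect predictions per input sentence. Under SetFit's ABSA API each prediction pairs a detected span with its polarity; illustratively (an assumed example, not captured output), something like `[{'span': 'food', 'polarity': 'positive'}, {'span': 'venue', 'polarity': 'negative'}]`.
+ 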
+ <!--
+ ### Downstream Use
+ 
+ *List how someone could finetune this model on their own dataset.*
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Set Metrics
+ | Training set | Min | Median  | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count   | 4   | 17.9296 | 37  |
+ 
+ | Label     | Training Sample Count |
+ |:----------|:----------------------|
+ | no aspect | 71                    |
+ | aspect    | 128                   |
+ 
+ ### Training Hyperparameters
+ - batch_size: (16, 2)
+ - num_epochs: (1, 16)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
+ 
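+ As a hedged illustration (not the original training script), the hyperparameters above map onto SetFit's `TrainingArguments` and `AbsaTrainer` roughly like so. The dataset id is a placeholder, assuming a 🤗 Dataset with the "text", "span", "label" and "ordinal" columns that `AbsaTrainer` expects:
+ 
+ ```python
+ from datasets import load_dataset
+ from setfit import AbsaModel, AbsaTrainer, TrainingArguments
+ 
+ # Placeholder dataset id; the training dataset of this model is not listed.
+ train_dataset = load_dataset("user/absa-restaurants", split="train")
+ 
+ model = AbsaModel.from_pretrained(
+     "sentence-transformers/paraphrase-mpnet-base-v2",
+     spacy_model="en_core_web_lg",
+ )
+ args = TrainingArguments(
+     batch_size=(16, 2),                 # (embedding phase, classifier phase)
+     num_epochs=(1, 16),                 # (embedding phase, classifier phase)
+     body_learning_rate=(2e-05, 1e-05),  # learning rates for the ST body per phase
+     head_learning_rate=0.01,
+     sampling_strategy="oversampling",
+     warmup_proportion=0.1,
+     use_amp=False,
+     seed=42,
+ )
+ trainer = AbsaTrainer(model, args=args, train_dataset=train_dataset)
+ trainer.train()
+ ```
+ 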
175
+ ### Training Results
176
+ | Epoch | Step | Training Loss | Validation Loss |
177
+ |:------:|:----:|:-------------:|:---------------:|
178
+ | 0.0007 | 1 | 0.3388 | - |
179
+ | 0.0370 | 50 | 0.2649 | - |
180
+ | 0.0740 | 100 | 0.1562 | - |
181
+ | 0.1109 | 150 | 0.1072 | - |
182
+ | 0.1479 | 200 | 0.0021 | - |
183
+ | 0.1849 | 250 | 0.0007 | - |
184
+ | 0.2219 | 300 | 0.0008 | - |
185
+ | 0.2589 | 350 | 0.0003 | - |
186
+ | 0.2959 | 400 | 0.0002 | - |
187
+ | 0.3328 | 450 | 0.0003 | - |
188
+ | 0.3698 | 500 | 0.0002 | - |
189
+ | 0.4068 | 550 | 0.0001 | - |
190
+ | 0.4438 | 600 | 0.0001 | - |
191
+ | 0.4808 | 650 | 0.0001 | - |
192
+ | 0.5178 | 700 | 0.0001 | - |
193
+ | 0.5547 | 750 | 0.0001 | - |
194
+ | 0.5917 | 800 | 0.0001 | - |
195
+ | 0.6287 | 850 | 0.0002 | - |
196
+ | 0.6657 | 900 | 0.0001 | - |
197
+ | 0.7027 | 950 | 0.0001 | - |
198
+ | 0.7396 | 1000 | 0.0001 | - |
199
+ | 0.7766 | 1050 | 0.0001 | - |
200
+ | 0.8136 | 1100 | 0.0001 | - |
201
+ | 0.8506 | 1150 | 0.0001 | - |
202
+ | 0.8876 | 1200 | 0.0001 | - |
203
+ | 0.9246 | 1250 | 0.0001 | - |
204
+ | 0.9615 | 1300 | 0.0001 | - |
205
+ | 0.9985 | 1350 | 0.0 | - |
206
+
+ ### Environmental Impact
+ Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
+ - **Carbon Emitted**: 0.009 kg of CO2
+ - **Hours Used**: 0.111 hours
+ 
+ ### Training Hardware
+ - **On Cloud**: No
+ - **GPU Model**: 1 x NVIDIA GeForce RTX 3090
+ - **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
+ - **RAM Size**: 31.78 GB
+ 
+ ### Framework Versions
+ - Python: 3.9.16
+ - SetFit: 1.0.0.dev0
+ - Sentence Transformers: 2.2.2
+ - spaCy: 3.7.2
+ - Transformers: 4.29.0
+ - PyTorch: 1.13.1+cu117
+ - Datasets: 2.15.0
+ - Tokenizers: 0.13.3
+ 
+ ## Citation
+ 
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "C:\\Users\\tom/.cache\\torch\\sentence_transformers\\sentence-transformers_paraphrase-mpnet-base-v2\\",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.29.0",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.0.0",
+     "transformers": "4.7.0",
+     "pytorch": "1.9.0+cu102"
+   }
+ }
config_setfit.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "normalize_embeddings": false,
+   "spacy_model": "en_core_web_lg",
+   "span_context": 0,
+   "labels": [
+     "no aspect",
+     "aspect"
+   ]
+ }
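The widget examples and the Model Labels table in the README show candidates encoded as `span:sentence` pairs. A hedged sketch of that pairing (the helper below is illustrative, not SetFit's actual function; `span_context: 0` is read here as "no extra context window around the span"):

```python
def build_candidate_input(span: str, sentence: str) -> str:
    # Illustrative only: mirrors the "span:sentence" format visible in the
    # README's widget examples, e.g. "bar:After really enjoying ourselves..."
    return f"{span}:{sentence}"

print(build_candidate_input(
    "bar",
    "After really enjoying ourselves at the bar we sat down at a table and had dinner.",
))
```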
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6ead41e5c4e2c947c81f083d6e3cec7d1c1a9bb4f9d8b99fcaaf3a87370afa5d
+ size 6991
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
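This file wires the Transformer module (idx 0) and the 1_Pooling module (idx 1) into a Sentence Transformer pipeline. A minimal sketch of loading the embedding body directly; this assumes the repository loads as a plain Sentence Transformer, which these module files make possible:

```python
from sentence_transformers import SentenceTransformer

# Reassembles the Transformer + Pooling modules declared in modules.json.
model = SentenceTransformer("tomaarsen/setfit-absa-paraphrase-mpnet-base-v2-restaurants-aspect")
embeddings = model.encode(["After really enjoying ourselves at the bar we sat down at a table and had dinner."])
print(embeddings.shape)  # (1, 768), per the pooling config's word_embedding_dimension
```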
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:09f797e4db48ed76f9111dc422c3a8156119230fd042932eb6d35edc5eef4bb7
+ size 438016493
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,66 @@
+ {
+   "bos_token": {
+     "__type": "AddedToken",
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": {
+     "__type": "AddedToken",
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": {
+     "__type": "AddedToken",
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "__type": "AddedToken",
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": {
+     "__type": "AddedToken",
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "__type": "AddedToken",
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "unk_token": {
+     "__type": "AddedToken",
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff