Add SetFit model

Files changed (10)

1_Pooling/config.json +1 -1
README.md +25 -34
config.json +10 -20
config_sentence_transformers.json +3 -3
model.safetensors +2 -2
model_head.pkl +2 -2
sentence_bert_config.json +1 -1
tokenizer.json +2 -2
tokenizer_config.json +8 -1
vocab.txt +0 -6

1_Pooling/config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "word_embedding_dimension": 512,
   "pooling_mode_cls_token": false,
   "pooling_mode_mean_tokens": true,
   "pooling_mode_max_tokens": false,

 {
+  "word_embedding_dimension": 384,
   "pooling_mode_cls_token": false,
   "pooling_mode_mean_tokens": true,
   "pooling_mode_max_tokens": false,
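
For context, `"pooling_mode_mean_tokens": true` means the sentence embedding is the average of the token embeddings, usually with padding positions masked out. The following is a minimal pure-Python sketch of that idea, not the sentence-transformers implementation. The function name `mean_pool` and the toy 3-dimensional vectors are illustrative assumptions; after this change the real token vectors have 384 dimensions.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    # Keep only real tokens; mask value 0 marks padding.
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    dim = len(kept[0])
    # Component-wise mean over the kept tokens.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Hypothetical toy input: two real tokens and one padding token.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

The padding vector does not influence `pooled`, which is why the mask matters when batches contain sequences of different lengths.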

README.md CHANGED

@@ -24,9 +24,9 @@ widget:
     ground-based reference data.
 pipeline_tag: text-classification
 inference: true
-base_model: jinaai/jina-embeddings-v2-small-en
 model-index:
-- name: SetFit with jinaai/jina-embeddings-v2-small-en
   results:
   - task:
       type: text-classification
@@ -37,13 +37,13 @@ model-index:
       split: test
     metrics:
     - type: accuracy
-      value: 0.8492307692307692
       name: Accuracy
 ---
-# SetFit with jinaai/jina-embeddings-v2-small-en
-This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [jinaai/jina-embeddings-v2-small-en](https://huggingface.co/jinaai/jina-embeddings-v2-small-en) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
 The model has been trained using an efficient few-shot learning technique that involves:
@@ -54,9 +54,9 @@ The model has been trained using an efficient few-shot learning technique that i
 ### Model Description
 - **Model Type:** SetFit
-- **Sentence Transformer body:** [jinaai/jina-embeddings-v2-small-en](https://huggingface.co/jinaai/jina-embeddings-v2-small-en)
 - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
-- **Maximum Sequence Length:** 8192 tokens
 - **Number of Classes:** 13 classes
 <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
 <!-- - **Language:** Unknown -->
@@ -90,7 +90,7 @@ The model has been trained using an efficient few-shot learning technique that i
 ### Metrics
 | Label   | Accuracy |
 |:--------|:---------|
-| **all** | 0.8492   |
 ## Uses
@@ -181,32 +181,23 @@ preds = model("This paper focuses on mining association rules between sets of it
 ### Training Results
 | Epoch  | Step | Training Loss | Validation Loss |
 |:------:|:----:|:-------------:|:---------------:|
-| 0.0025 | 1    | 0.2913        | -               |
-| 0.1229 | 50   | 0.2365        | -               |
-| 0.2457 | 100  | 0.185         | -               |
-| 0.3686 | 150  | 0.159         | -               |
-| 0.4914 | 200  | 0.1456        | -               |
-| 0.6143 | 250  | 0.1658        | -               |
-| 0.7371 | 300  | 0.1189        | -               |
-| 0.8600 | 350  | 0.1235        | -               |
-| 0.9828 | 400  | 0.1282        | -               |
-| 0.0049 | 1    | 0.1257        | -               |
-| 0.0615 | 50   | 0.1371        | -               |
-| 0.1230 | 100  | 0.1226        | -               |
-| 0.1845 | 150  | 0.1099        | -               |
-| 0.2460 | 200  | 0.0897        | -               |
-| 0.3075 | 250  | 0.1009        | -               |
-| 0.3690 | 300  | 0.0659        | -               |
-| 0.4305 | 350  | 0.0711        | -               |
-| 0.4920 | 400  | 0.0745        | -               |
-| 0.5535 | 450  | 0.0807        | -               |
-| 0.6150 | 500  | 0.0736        | -               |
-| 0.6765 | 550  | 0.0571        | -               |
-| 0.7380 | 600  | 0.0649        | -               |
-| 0.7995 | 650  | 0.0672        | -               |
-| 0.8610 | 700  | 0.0586        | -               |
-| 0.9225 | 750  | 0.0624        | -               |
-| 0.9840 | 800  | 0.0614        | -               |
 ### Framework Versions
 - Python: 3.10.12

     ground-based reference data.
 pipeline_tag: text-classification
 inference: true
+base_model: sentence-transformers/paraphrase-MiniLM-L3-v2
 model-index:
+- name: SetFit with sentence-transformers/paraphrase-MiniLM-L3-v2
   results:
   - task:
       type: text-classification
       split: test
     metrics:
     - type: accuracy
+      value: 0.7407692307692307
       name: Accuracy
 ---
+# SetFit with sentence-transformers/paraphrase-MiniLM-L3-v2
+This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-MiniLM-L3-v2](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L3-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
 The model has been trained using an efficient few-shot learning technique that involves:
 ### Model Description
 - **Model Type:** SetFit
+- **Sentence Transformer body:** [sentence-transformers/paraphrase-MiniLM-L3-v2](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L3-v2)
 - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+- **Maximum Sequence Length:** 128 tokens
 - **Number of Classes:** 13 classes
 <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
 <!-- - **Language:** Unknown -->
 ### Metrics
 | Label   | Accuracy |
 |:--------|:---------|
+| **all** | 0.7408   |
 ## Uses
 ### Training Results
 | Epoch  | Step | Training Loss | Validation Loss |
 |:------:|:----:|:-------------:|:---------------:|
+| 0.0012 | 1    | 0.4201        | -               |
+| 0.0615 | 50   | 0.2562        | -               |
+| 0.1230 | 100  | 0.2334        | -               |
+| 0.1845 | 150  | 0.1974        | -               |
+| 0.2460 | 200  | 0.195         | -               |
+| 0.3075 | 250  | 0.1768        | -               |
+| 0.3690 | 300  | 0.146         | -               |
+| 0.4305 | 350  | 0.1541        | -               |
+| 0.4920 | 400  | 0.1647        | -               |
+| 0.5535 | 450  | 0.154         | -               |
+| 0.6150 | 500  | 0.1568        | -               |
+| 0.6765 | 550  | 0.1494        | -               |
+| 0.7380 | 600  | 0.1554        | -               |
+| 0.7995 | 650  | 0.1456        | -               |
+| 0.8610 | 700  | 0.1527        | -               |
+| 0.9225 | 750  | 0.1488        | -               |
+| 0.9840 | 800  | 0.1312        | -               |
 ### Framework Versions
 - Python: 3.10.12

config.json CHANGED

@@ -1,36 +1,26 @@
 {
-  "_name_or_path": "/root/.cache/torch/sentence_transformers/jinaai_jina-embeddings-v2-small-en/",
   "architectures": [
-    "JinaBertModel"
   ],
-  "attention_probs_dropout_prob": 0.0,
-  "attn_implementation": null,
-  "auto_map": {
-    "AutoConfig": "configuration_bert.JinaBertConfig",
-    "AutoModel": "modeling_bert.JinaBertModel",
-    "AutoModelForMaskedLM": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForMaskedLM",
-    "AutoModelForSequenceClassification": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForSequenceClassification"
-  },
   "classifier_dropout": null,
-  "emb_pooler": "mean",
-  "feed_forward_type": "geglu",
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
   "hidden_dropout_prob": 0.1,
-  "hidden_size": 512,
   "initializer_range": 0.02,
-  "intermediate_size": 2048,
   "layer_norm_eps": 1e-12,
-  "max_position_embeddings": 8192,
-  "model_max_length": 8192,
   "model_type": "bert",
-  "num_attention_heads": 8,
-  "num_hidden_layers": 4,
   "pad_token_id": 0,
-  "position_embedding_type": "alibi",
   "torch_dtype": "float32",
   "transformers_version": "4.36.2",
   "type_vocab_size": 2,
   "use_cache": true,
-  "vocab_size": 30528
 }

 {
+  "_name_or_path": "/root/.cache/torch/sentence_transformers/sentence-transformers_paraphrase-MiniLM-L3-v2/",
   "architectures": [
+    "BertModel"
   ],
+  "attention_probs_dropout_prob": 0.1,
   "classifier_dropout": null,
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
   "hidden_dropout_prob": 0.1,
+  "hidden_size": 384,
   "initializer_range": 0.02,
+  "intermediate_size": 1536,
   "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
   "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 3,
   "pad_token_id": 0,
+  "position_embedding_type": "absolute",
   "torch_dtype": "float32",
   "transformers_version": "4.36.2",
   "type_vocab_size": 2,
   "use_cache": true,
+  "vocab_size": 30522
 }

config_sentence_transformers.json CHANGED

@@ -1,7 +1,7 @@
 {
   "__version__": {
-    "sentence_transformers": "2.2.2",
-    "transformers": "4.31.0",
-    "pytorch": "2.0.1"
   }
 }

 {
   "__version__": {
+    "sentence_transformers": "2.0.0",
+    "transformers": "4.7.0",
+    "pytorch": "1.9.0+cu102"
   }
 }

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c1529f97f7d63f60cb1caf5043049c5be4b244a452b7596283781b007c81a7b
-size 130769960

 version https://git-lfs.github.com/spec/v1
+oid sha256:782421e8a8f86650f5c4c24184bb8cde66eb095e4f2bce737ad3508d1c844bd8
+size 69565312
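
As a sanity check, the new file size is close to what the updated `config.json` values imply for a float32 BERT encoder. The sketch below counts parameters for a standard BERT layout (embeddings, encoder layers, pooler). The helper name `bert_param_count` is hypothetical, and it is an assumption that the checkpoint stores exactly these tensors and nothing else.

```python
def bert_param_count(vocab_size, hidden, layers, intermediate, max_pos, type_vocab):
    """Parameter count for a standard BERT encoder (assumed layout)."""
    # Embeddings: word + position + token-type tables, plus LayerNorm weight and bias.
    embeddings = (vocab_size + max_pos + type_vocab) * hidden + 2 * hidden
    # Self-attention: Q, K, V and output projections (weights + biases), plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + 2 * hidden
    # Feed-forward: up and down projections (weights + biases), plus LayerNorm.
    ffn = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden) + 2 * hidden
    # Pooler: one dense layer applied to the first token.
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + ffn) + pooler


# Values taken from the new config.json in this commit.
params = bert_param_count(
    vocab_size=30522, hidden=384, layers=3,
    intermediate=1536, max_pos=512, type_vocab=2,
)
weight_bytes = params * 4          # float32 = 4 bytes per parameter
file_size = 69565312               # size reported for model.safetensors
overhead = file_size - weight_bytes
print(params, weight_bytes, overhead)
```

Under these assumptions the count is 17,389,824 parameters, or 69,559,296 bytes of weights, leaving 6,016 bytes that would be consistent with the safetensors header.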

model_head.pkl CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f033529ac4cb485d155c9a7f4466798a188186c7c1c599327842835c47c9c7a3
-size 54959

 version https://git-lfs.github.com/spec/v1
+oid sha256:6790c7fffe6c2ab476607806d7b8ab06f8b147b2dce5a6a6eba84ea624ba05b8
+size 41647

sentence_bert_config.json CHANGED

@@ -1,4 +1,4 @@
 {
-  "max_seq_length": 8192,
   "do_lower_case": false
 }

 {
+  "max_seq_length": 128,
   "do_lower_case": false
 }

tokenizer.json CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b36ee0ed6d20d181de65ce729dea6169658a9cfa4dede6717ba9fa2e4fbd3bc7
-size 711827

 version https://git-lfs.github.com/spec/v1
+oid sha256:2fc687b11de0bc1b3d8348f92e3b49ef1089a621506c7661fbf3248fcd54947e
+size 711649

tokenizer_config.json CHANGED

@@ -46,12 +46,19 @@
   "do_basic_tokenize": true,
   "do_lower_case": true,
   "mask_token": "[MASK]",
-  "model_max_length": 2147483648,
   "never_split": null,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "strip_accents": null,
   "tokenize_chinese_chars": true,
   "tokenizer_class": "BertTokenizer",
   "unk_token": "[UNK]"
 }

   "do_basic_tokenize": true,
   "do_lower_case": true,
   "mask_token": "[MASK]",
+  "max_length": 128,
+  "model_max_length": 512,
   "never_split": null,
+  "pad_to_multiple_of": null,
   "pad_token": "[PAD]",
+  "pad_token_type_id": 0,
+  "padding_side": "right",
   "sep_token": "[SEP]",
+  "stride": 0,
   "strip_accents": null,
   "tokenize_chinese_chars": true,
   "tokenizer_class": "BertTokenizer",
+  "truncation_side": "right",
+  "truncation_strategy": "longest_first",
   "unk_token": "[UNK]"
 }

vocab.txt CHANGED

@@ -30520,9 +30520,3 @@ necessitated
 ##：
 ##？
 ##～
-bowang
-georgiosmastrapas
-jackminong
-jonathangeuter
-louismilliken
-michaelguenther

 ##：
 ##？
 ##～