Add SetFit model

Files changed (13)

1_Pooling/config.json +10 -0
README.md +293 -0
config.json +32 -0
config_sentence_transformers.json +10 -0
config_setfit.json +4 -0
model.safetensors +3 -0
model_head.pkl +3 -0
modules.json +20 -0
sentence_bert_config.json +4 -0
special_tokens_map.json +37 -0
tokenizer.json +0 -0
tokenizer_config.json +57 -0
vocab.txt +0 -0

1_Pooling/config.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "word_embedding_dimension": 768,
+  "pooling_mode_cls_token": true,
+  "pooling_mode_mean_tokens": false,
+  "pooling_mode_max_tokens": false,
+  "pooling_mode_mean_sqrt_len_tokens": false,
+  "pooling_mode_weightedmean_tokens": false,
+  "pooling_mode_lasttoken": false,
+  "include_prompt": true
+}
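
Reviewer note: `pooling_mode_cls_token` is the only pooling mode set to `true` here, so the sentence embedding is taken from the first (`[CLS]`) token rather than averaged over all tokens. A minimal pure-Python sketch of the difference follows. It uses made-up 3-dimensional vectors in place of the real 768-dimensional ones, and `cls_pool` and `mean_pool` are illustrative names, not library functions.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the embedding of the first token ([CLS]) as the sentence vector."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector]) -> Vector:
    """Average all token embeddings per dimension (disabled in this config)."""
    n = len(token_embeddings)
    return [sum(dim) / n for dim in zip(*token_embeddings)]


# Hypothetical token embeddings for: [CLS], two word pieces, [SEP].
tokens = [
    [1.0, 0.0, 2.0],
    [3.0, 4.0, 0.0],
    [5.0, 0.0, 2.0],
    [3.0, 0.0, 0.0],
]

cls_vector = cls_pool(tokens)
mean_vector = mean_pool(tokens)
print(cls_vector)
print(mean_vector)
```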

README.md ADDED

	@@ -0,0 +1,293 @@

+---
+base_model: BAAI/bge-base-en-v1.5
+library_name: setfit
+metrics:
+- accuracy
+pipeline_tag: text-classification
+tags:
+- setfit
+- sentence-transformers
+- text-classification
+- generated_from_setfit_trainer
+widget:
+- text: 'Reasoning:
+    1. Context Grounding: The provided answer reflects the key identifiers of a funnel
+    spider found in the document, such as the dark brown or black body, hard shiny
+    carapace, and large fangs.
+    2. Relevance: The answer directly addresses the question of how to identify a
+    funnel spider with relevant details.
+    3. Conciseness: The answer is clear, with pertinent details, and avoids unnecessary
+    information.
+    Final evaluation:'
+- text: 'Evaluation:
+    1. Context Grounding:
+    The answer is well-supported by the provided document. It correctly mentions creating
+    margins, using a 12-point font, double-spacing text, creating a running header,
+    typing the heading in the upper left corner, and starting the body of the paper
+    with left-aligned text.
+    2. Relevance:
+    The answer is relevant to the question asked and specifically addresses how to
+    write a paper in MLA format. It does not deviate into unrelated topics.
+    3. Conciseness:
+    The answer is clear and to the point, avoiding unnecessary information, and covers
+    the key steps required to write in MLA format.
+    Final Result:'
+- text: 'The answer provided is partially correct but lacks several important details
+    covered in the document. It highlights the importance of grades in specific subjects
+    and getting clinical experience, but misses other key steps such as involving
+    oneself in extracurricular activities, understanding application procedures, preparing
+    extensively for the MCAT, and engaging with advisors and peers.
+    **Reasoning:**
+    1. **Context Grounding:** The answer does cover some points mentioned in the document
+    such as focusing on grades in specific subjects (BCPM) and getting clinical experience.
+    However, it omits several other detailed recommendations provided in the document.
+    2. **Relevance:** The answer is somewhat relevant to the question but fails to
+    address the full scope of steps required to get into medical school as outlined
+    in the document.
+    3. **Conciseness:** While the answer is concise, it lacks the critical breadth
+    needed to be truly adequate and comprehensive for getting into medical school.
+    **Final Evaluation: **'
+- text: 'Evaluation:
+    1. Context Grounding: The answer is largely grounded in the provided document
+    but adds some minor details that were not explicitly mentioned, such as "putting
+    the clothes on top of you to cover your body."
+    2. Relevance: The answer is highly relevant to the question asked and focuses
+    on strategies and techniques for playing hide and seek, which is directly related
+    to the document''s content.
+    3. Conciseness: The answer is slightly long-winded but generally clear. It could
+    be more concise by removing extraneous details.
+    Final result:'
+- text: "**Evaluation Reasoning:**\n\n1. **Context Grounding:**\n   - **Weak Grounding:**\
+    \ The provided instructions for making a saline solution and administering it\
+    \ are not entirely accurate based on the document. The document specifies different\
+    \ proportions (1 cup water, 1/2 teaspoon salt, 1/2 teaspoon baking soda) and advises\
+    \ against overly inserting the bulb.\n  \n2. **Relevance:**\n   - **Partially\
+    \ Relevant:** The answer attempts to address the question about treating a baby's\
+    \ cough but contains inaccuracies and some deviations that make it only partially\
+    \ relevant.\n\n3. **Conciseness:**\n   - **Problems with Conciseness:** The answer\
+    \ includes extraneous details not needed to succinctly treat a baby’s cough, especially\
+    \ the inclusion of incorrect proportion information which adds confusion rather\
+    \ than clarity.\n\n**Final Evaluation: **"
+inference: true
+model-index:
+- name: SetFit with BAAI/bge-base-en-v1.5
+  results:
+  - task:
+      type: text-classification
+      name: Text Classification
+    dataset:
+      name: Unknown
+      type: unknown
+      split: test
+    metrics:
+    - type: accuracy
+      value: 0.8648648648648649
+      name: Accuracy
+---
+# SetFit with BAAI/bge-base-en-v1.5
+This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+The model has been trained using an efficient few-shot learning technique that involves:
+1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+2. Training a classification head with features from the fine-tuned Sentence Transformer.
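
Reviewer note: step 1 relies on sentence pairs built from the few labelled examples. Pairs that share a label get a target similarity of 1.0 and pairs with different labels get 0.0, which is what a cosine-similarity loss is then trained against. A minimal sketch, assuming exhaustive pairing of a toy dataset (the real trainer samples pairs instead; `make_pairs` and the example texts are hypothetical):

```python
from itertools import combinations
from typing import List, Tuple

Pair = Tuple[str, str, float]


def make_pairs(texts: List[str], labels: List[int]) -> List[Pair]:
    """Build (text_a, text_b, target) triples: 1.0 for same label, 0.0 otherwise."""
    pairs: List[Pair] = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy stand-ins for the evaluation texts this model classifies.
texts = ["grounded and concise", "well supported", "unsupported claim", "off topic"]
labels = [1, 1, 0, 0]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

In step 2, the fine-tuned encoder's embeddings of the original examples become the features for the classification head.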
+## Model Details
+### Model Description
+- **Model Type:** SetFit
+- **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
+- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+- **Maximum Sequence Length:** 512 tokens
+- **Number of Classes:** 2 classes
+<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+<!-- - **Language:** Unknown -->
+<!-- - **License:** Unknown -->
+### Model Sources
+- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+### Model Labels
+| Label | Examples                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+|:------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 1     | <ul><li>"The answer provided is comprehensive and directly addresses the question. Here is the reasoning:\n\n1. **Context Grounding:** The answer precisely matches the details provided in the document. Patricia Wallace's various roles, including managing a clothing closet, overseeing a food pantry, coordinating the food backpack program, and leading the Intervention Support Team, are well-supported by the text.\n   \n2. **Relevance:** The answer is entirely relevant to the question, as it lists the specific roles and responsibilities of Patricia Wallace at Oak View Elementary as outlined in the document.\n\n3. **Conciseness:** The answer is clear and focused, listing the relevant roles and responsibilities without unnecessaryinformation.\n\nTherefore, the evaluation is:"</li><li>"Reasoning:\n1. Context Grounding: The answer is well supported by the document provided. It details the necessary steps to administer a saline solution to a baby, which is a method found within the source text.\n2. Relevance: The answer focuses specifically on treating a baby's cough, directly addressing the question asked.\n3. Conciseness: The answer is clear and to the point, providing concrete steps without unnecessary information. However, the answer could have been made even more concise by avoiding repetitions about the saline solution preparation.\n\nFinal evaluation:"</li><li>'Reasoning:\n1. Context Grounding: The answer provided accurately reflects the information in the document, describing the symptoms, risk factors, and necessary actions if toxic shock syndrome (TSS) is suspected.\n2. Relevance: The answer directly addresses the question asked, focusing on how to recognize TSS and what to do if you suspect you have it.\n3. Conciseness: The answer effectively condenses the necessary information into a coherent, straightforward explanation without extraneous details.\n\nFinal Evaluation:'</li></ul> |
+| 0     | <ul><li>'Evaluation:\nThe answer provided incorrectly identifies the creation of a "literature hall" instead of a "science hall" as mentioned in the document. The answer also correctly attributes the oversight to Fr. Zahm, but this information is related to the wrong type of hall as per the document.\n\n1. Context Grounding: The document specifically states that a "Science Hall" was built under the direction of Fr. Zahm in 1883, not a literature hall.\n2. Relevance: The answer partially addresses the question correctly by mentioning Fr. Zahm, but it misidentifies the type of hall constructed.\n3. Conciseness: The answer is concise but includesincorrect information.\n\nThe final evaluation:'</li><li>'Reasoning:\n1. Context Grounding: The document supports that Gregory Johnson is the CEO of Franklin Templeton Investments and provides sufficient context about his role and relation to the company.\n2. Relevance: The answer directly addresses the question about the CEO of Franklin Templeton Investments.\n3. Conciseness: The answer presents the information clearly and succinctly without unnecessary details.\n\nFinal Result: ****'</li><li>'The answer correctly identifies that retired priests and brothers live at Fatima House. However, the additional information about the rare collection of ancient religious manuscripts at Fatima House is not supported by the document, making it an irrelevant addition. This deviates from the principle of conciseness and relevance to the specific question asked.\n\nFinal evaluation:'</li></ul>                                                                                                                                                                                                                                                                                                                                                                                      |
+## Evaluation
+### Metrics
+| Label   | Accuracy |
+|:--------|:---------|
+| **all** | 0.8649   |
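
Reviewer note: the card does not state the size of the test split. One quick check is to recover the simplest fraction behind the unrounded accuracy from the metadata. This is offered as an inference only; the result suggests, but does not prove, a test set of 37 examples or a multiple of that.

```python
from fractions import Fraction

# Unrounded accuracy as reported in the model-index metadata.
accuracy = 0.8648648648648649

# Closest fraction with a small denominator (the bound of 1000 is an arbitrary choice).
ratio = Fraction(accuracy).limit_denominator(1000)
print(ratio.numerator, "/", ratio.denominator)

# The metrics table shows the same value rounded to four decimals.
rounded = round(accuracy, 4)
print(rounded)
```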
+## Uses
+### Direct Use for Inference
+First install the SetFit library:
+```bash
+pip install setfit
+```
+Then you can load this model and run inference.
+```python
+from setfit import SetFitModel
+# Download from the 🤗 Hub
+model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wikisum_gpt-4o_cot-few_shot-instructions_remove_final_evaluation_e1_large")
+# Run inference
+preds = model("""Reasoning:
+1. Context Grounding: The provided answer reflects the key identifiers of a funnel spider found in the document, such as the dark brown or black body, hard shiny carapace, and large fangs.
+2. Relevance: The answer directly addresses the question of how to identify a funnel spider with relevant details.
+3. Conciseness: The answer is clear, with pertinent details, and avoids unnecessary information.
+Final evaluation:""")
+```
+<!--
+### Downstream Use
+*List how someone could finetune this model on their own dataset.*
+-->
+<!--
+### Out-of-Scope Use
+*List how the model may foreseeably be misused and address what users ought not to do with the model.*
+-->
+<!--
+## Bias, Risks and Limitations
+*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+-->
+<!--
+### Recommendations
+*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+-->
+## Training Details
+### Training Set Metrics
+| Training set | Min | Median  | Max |
+|:-------------|:----|:--------|:----|
+| Word count   | 11  | 76.2020 | 196 |
+
+| Label | Training Sample Count |
+|:------|:----------------------|
+| 0     | 94                    |
+| 1     | 104                   |
+### Training Hyperparameters
+- batch_size: (16, 16)
+- num_epochs: (1, 1)
+- max_steps: -1
+- sampling_strategy: oversampling
+- num_iterations: 20
+- body_learning_rate: (2e-05, 2e-05)
+- head_learning_rate: 2e-05
+- loss: CosineSimilarityLoss
+- distance_metric: cosine_distance
+- margin: 0.25
+- end_to_end: False
+- use_amp: False
+- warmup_proportion: 0.1
+- l2_weight: 0.01
+- seed: 42
+- eval_max_steps: -1
+- load_best_model_at_end: False
+### Training Results
+| Epoch  | Step | Training Loss | Validation Loss |
+|:------:|:----:|:-------------:|:---------------:|
+| 0.0020 | 1    | 0.2119        | -               |
+| 0.1010 | 50   | 0.255         | -               |
+| 0.2020 | 100  | 0.1703        | -               |
+| 0.3030 | 150  | 0.0611        | -               |
+| 0.4040 | 200  | 0.0351        | -               |
+| 0.5051 | 250  | 0.0197        | -               |
+| 0.6061 | 300  | 0.0172        | -               |
+| 0.7071 | 350  | 0.0109        | -               |
+| 0.8081 | 400  | 0.0108        | -               |
+| 0.9091 | 450  | 0.0072        | -               |
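
Reviewer note: the epoch fractions in the table are consistent with one epoch of 495 steps. The back-of-the-envelope check below assumes the trainer draws 2 × `num_iterations` pairs per training sample and batches them by 16. That is an inference from the numbers, not something the card states.

```python
# Values taken from the training-set and hyperparameter sections of the card.
train_samples = 94 + 104  # label 0 + label 1
num_iterations = 20
batch_size = 16

# Assumption: each sample contributes num_iterations positive and negative pairs.
total_pairs = 2 * num_iterations * train_samples
steps_per_epoch = total_pairs // batch_size

# Steps at which the card logs a training loss.
logged_steps = [1, 50, 100, 150, 200, 250, 300, 350, 400, 450]
epoch_fractions = [round(step / steps_per_epoch, 4) for step in logged_steps]

print(total_pairs, steps_per_epoch)
print(epoch_fractions)
```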
+### Framework Versions
+- Python: 3.10.14
+- SetFit: 1.1.0
+- Sentence Transformers: 3.1.1
+- Transformers: 4.44.0
+- PyTorch: 2.4.0+cu121
+- Datasets: 3.0.0
+- Tokenizers: 0.19.1
+## Citation
+### BibTeX
+```bibtex
+@article{https://doi.org/10.48550/arxiv.2209.11055,
+    doi = {10.48550/ARXIV.2209.11055},
+    url = {https://arxiv.org/abs/2209.11055},
+    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+    title = {Efficient Few-Shot Learning Without Prompts},
+    publisher = {arXiv},
+    year = {2022},
+    copyright = {Creative Commons Attribution 4.0 International}
+}
+```
+<!--
+## Glossary
+*Clearly define terms in order to be accessible across audiences.*
+-->
+<!--
+## Model Card Authors
+*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+-->
+<!--
+## Model Card Contact
+*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+-->

config.json ADDED

	@@ -0,0 +1,32 @@

+{
+  "_name_or_path": "BAAI/bge-base-en-v1.5",
+  "architectures": [
+    "BertModel"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "classifier_dropout": null,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "LABEL_0": 0
+  },
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "torch_dtype": "float32",
+  "transformers_version": "4.44.0",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}

config_sentence_transformers.json ADDED

	@@ -0,0 +1,10 @@

+{
+  "__version__": {
+    "sentence_transformers": "3.1.1",
+    "transformers": "4.44.0",
+    "pytorch": "2.4.0+cu121"
+  },
+  "prompts": {},
+  "default_prompt_name": null,
+  "similarity_fn_name": null
+}

config_setfit.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "normalize_embeddings": false,
+  "labels": null
+}

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4fc11758b8fdb64cf479832f3e59f5cb86764aa957e5c8560f706f9c9ca7eb8e
+size 437951328
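
Reviewer note: the three lines above are a Git LFS pointer, not the weights themselves. A small sketch of reading such a pointer and sanity-checking the size against the `float32` dtype declared in `config.json` follows. `parse_lfs_pointer` is a hypothetical helper, and the parameter estimate ignores the safetensors header, so it is approximate.

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4fc11758b8fdb64cf479832f3e59f5cb86764aa957e5c8560f706f9c9ca7eb8e
size 437951328
"""

info = parse_lfs_pointer(pointer)
size_bytes = int(info["size"])

# float32 uses 4 bytes per value; header bytes are ignored (rough estimate only).
approx_params = size_bytes // 4
print(info["oid"], size_bytes, approx_params)
```

The estimate of roughly 109 million values is in line with a BERT-base-sized encoder (12 layers, hidden size 768).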

model_head.pkl ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a294344b9f3b56fd99a4984faf66fe39b0fb64afc86be02facb476f4320b6622
+size 7007

modules.json ADDED

	@@ -0,0 +1,20 @@

+[
+  {
+    "idx": 0,
+    "name": "0",
+    "path": "",
+    "type": "sentence_transformers.models.Transformer"
+  },
+  {
+    "idx": 1,
+    "name": "1",
+    "path": "1_Pooling",
+    "type": "sentence_transformers.models.Pooling"
+  },
+  {
+    "idx": 2,
+    "name": "2",
+    "path": "2_Normalize",
+    "type": "sentence_transformers.models.Normalize"
+  }
+]
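
Reviewer note: `modules.json` chains Transformer, Pooling, and Normalize, so the embedding body emits unit-length vectors. A pure-Python sketch of that last step is given below (`l2_normalize` is an illustrative name, not the library's implementation).

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy 2-dimensional embedding standing in for a 768-dimensional one.
unit = l2_normalize([3.0, 4.0])
unit_norm = math.sqrt(sum(x * x for x in unit))
print(unit, unit_norm)
```

For unit-length vectors, cosine similarity reduces to a plain dot product.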

sentence_bert_config.json ADDED

	@@ -0,0 +1,4 @@

+{
+  "max_seq_length": 512,
+  "do_lower_case": true
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,37 @@

+{
+  "cls_token": {
+    "content": "[CLS]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "mask_token": {
+    "content": "[MASK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "[PAD]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "[SEP]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "[UNK]",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,57 @@

+{
+  "added_tokens_decoder": {
+    "0": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "100": {
+      "content": "[UNK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "101": {
+      "content": "[CLS]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "102": {
+      "content": "[SEP]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "103": {
+      "content": "[MASK]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "clean_up_tokenization_spaces": true,
+  "cls_token": "[CLS]",
+  "do_basic_tokenize": true,
+  "do_lower_case": true,
+  "mask_token": "[MASK]",
+  "model_max_length": 512,
+  "never_split": null,
+  "pad_token": "[PAD]",
+  "sep_token": "[SEP]",
+  "strip_accents": null,
+  "tokenize_chinese_chars": true,
+  "tokenizer_class": "BertTokenizer",
+  "unk_token": "[UNK]"
+}

vocab.txt ADDED

The diff for this file is too large to render. See raw diff