---
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
metrics:
  - accuracy
  - weighted precision
  - weighted recall
  - weighted f1
  - macro precision
  - macro recall
  - macro f1
widget:
  - text: Roles can be assigned to a user account for individual products.
  - text: >-
      The number of active Subscription Versions in a sample to be monitored by
      the NPAC SMS.
  - text: 'The visual representation of an SDT or a part of an SDT. '
  - text: >-
      Open Society Institute Guide to Institutional Repository Software, 3rd ed.
      (2004)
  - text: >-
      The Application/Delete menu item shall provide an interface for deleting
      an application and all the files in the application directory. 
pipeline_tag: text-classification
inference: true
base_model: sentence-transformers/all-roberta-large-v1
model-index:
  - name: SetFit with sentence-transformers/all-roberta-large-v1
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.7621000820344545
            name: Accuracy
          - type: weighted precision
            value: 0.7627752679232598
            name: Weighted Precision
          - type: weighted recall
            value: 0.7621000820344545
            name: Weighted Recall
          - type: weighted f1
            value: 0.7621663772102192
            name: Weighted F1
          - type: macro precision
            value: 0.7621734718049769
            name: Macro Precision
          - type: macro recall
            value: 0.7624659767698817
            name: Macro Recall
          - type: macro f1
            value: 0.7620481988534211
            name: Macro F1
---

# SetFit with sentence-transformers/all-roberta-large-v1

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for text classification. It uses [sentence-transformers/all-roberta-large-v1](https://huggingface.co/sentence-transformers/all-roberta-large-v1) as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
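
To make this two-stage design concrete, here is a minimal inference sketch that runs the stages separately, assuming the `model_body` (Sentence Transformer) and `model_head` (LogisticRegression) attributes exposed by SetFit 1.x:

```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("kwang123/roberta-large-setfit-ReqORNot")

texts = ["Roles can be assigned to a user account for individual products."]

# Stage 1: the fine-tuned Sentence Transformer maps each text to a dense embedding.
embeddings = model.model_body.encode(texts)

# Stage 2: the LogisticRegression head predicts a label from each embedding.
preds = model.model_head.predict(embeddings)
print(preds)  # array of predicted labels (0 or 1)
```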

## Model Details

### Model Description

- **Model Type:** SetFit
- **Sentence Transformer body:** [sentence-transformers/all-roberta-large-v1](https://huggingface.co/sentence-transformers/all-roberta-large-v1)
- **Classification head:** a LogisticRegression instance
- **Number of Classes:** 2

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)

### Model Labels

| Label | Examples |
|:------|:---------|
| 1     | <ul><li>'The matrix dimensions are fixed, and are the same when displaying departments or categories.'</li><li>'The Clarus program shall provide for customer service.'</li><li>'NPAC SMS shall identify the originator of any accessible system resources.'</li></ul> |
| 0     | <ul><li>'A search pattern is a string w such that w is a sub-string of a string α and α is a string derived from some non- terminal β in the target grammar.'</li><li>'Normally only one or two parties are engaged in operation and maintenance of the wind turbine(s), typically the owner and the operation and maintenance organisation, which in some cases is one and the same.'</li><li>'TASE-2 (ICCP) resides on layer 7 in the OSI-model and is an MMS companion standard, that is, the general MMS services have been particularised for telecontrol applications.'</li></ul> |

## Evaluation

### Metrics

| Label | Accuracy | Weighted Precision | Weighted Recall | Weighted F1 | Macro Precision | Macro Recall | Macro F1 |
|:------|:---------|:-------------------|:----------------|:------------|:----------------|:-------------|:---------|
| all   | 0.7621   | 0.7628             | 0.7621          | 0.7622      | 0.7622          | 0.7625       | 0.7620   |
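
For reference, these averages can be recomputed from raw predictions with scikit-learn; a minimal sketch (the label arrays below are placeholders, not the actual test split):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder labels; substitute the gold labels and model predictions of the real test split.
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

accuracy = accuracy_score(y_true, y_pred)

# "weighted" averages per-class scores by class support; "macro" weights all classes equally.
w_p, w_r, w_f1, _ = precision_recall_fscore_support(y_true, y_pred, average="weighted")
m_p, m_r, m_f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")

print(f"accuracy={accuracy:.4f}  weighted_f1={w_f1:.4f}  macro_f1={m_f1:.4f}")
```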

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference:

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("kwang123/roberta-large-setfit-ReqORNot")
# Run inference
preds = model("The visual representation of an SDT or a part of an SDT. ")
```
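
The model also accepts a batch of texts; assuming the standard `SetFitModel` API, `predict` returns hard labels and `predict_proba` returns class probabilities (reusing `model` from above):

```python
texts = [
    "The Clarus program shall provide for customer service.",
    "Open Society Institute Guide to Institutional Repository Software, 3rd ed. (2004)",
]
preds = model.predict(texts)        # one 0/1 label per input text
probs = model.predict_proba(texts)  # per-class probabilities from the LogisticRegression head
print(preds, probs)
```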

## Training Details

### Training Set Metrics

| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 5   | 21.7708 | 46  |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 24                    |
| 1     | 24                    |

### Training Hyperparameters

- batch_size: (8, 8)
- num_epochs: (10, 10)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
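
These values mirror the fields of SetFit's `TrainingArguments`; a minimal training sketch under that assumption (the two-example dataset below is a placeholder, not the actual 24-per-class training data):

```python
from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder few-shot dataset with "text" and "label" columns.
train_dataset = Dataset.from_dict({
    "text": [
        "The Clarus program shall provide for customer service.",
        "Open Society Institute Guide to Institutional Repository Software, 3rd ed. (2004)",
    ],
    "label": [1, 0],
})

model = SetFitModel.from_pretrained("sentence-transformers/all-roberta-large-v1")

# Tuples give (embedding fine-tuning phase, classifier-training phase) values.
args = TrainingArguments(
    batch_size=(8, 8),
    num_epochs=(10, 10),
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    sampling_strategy="oversampling",
    warmup_proportion=0.1,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```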

### Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0067 | 1    | 0.3795        | -               |
| 0.3333 | 50   | 0.298         | -               |
| 0.6667 | 100  | 0.0025        | -               |
| 1.0    | 150  | 0.0002        | -               |
| 1.3333 | 200  | 0.0002        | -               |
| 1.6667 | 250  | 0.0001        | -               |
| 2.0    | 300  | 0.0001        | -               |
| 2.3333 | 350  | 0.0001        | -               |
| 2.6667 | 400  | 0.0001        | -               |
| 3.0    | 450  | 0.0001        | -               |
| 3.3333 | 500  | 0.0           | -               |
| 3.6667 | 550  | 0.0           | -               |
| 4.0    | 600  | 0.0           | -               |
| 4.3333 | 650  | 0.0001        | -               |
| 4.6667 | 700  | 0.0           | -               |
| 5.0    | 750  | 0.0           | -               |
| 5.3333 | 800  | 0.0           | -               |
| 5.6667 | 850  | 0.0           | -               |
| 6.0    | 900  | 0.0           | -               |
| 6.3333 | 950  | 0.0001        | -               |
| 6.6667 | 1000 | 0.0           | -               |
| 7.0    | 1050 | 0.0           | -               |
| 7.3333 | 1100 | 0.0           | -               |
| 7.6667 | 1150 | 0.0           | -               |
| 8.0    | 1200 | 0.0           | -               |
| 8.3333 | 1250 | 0.0           | -               |
| 8.6667 | 1300 | 0.0           | -               |
| 9.0    | 1350 | 0.0           | -               |
| 9.3333 | 1400 | 0.0           | -               |
| 9.6667 | 1450 | 0.0           | -               |
| 10.0   | 1500 | 0.0           | -               |

### Framework Versions

- Python: 3.10.12
- SetFit: 1.0.3
- Sentence Transformers: 2.5.1
- Transformers: 4.38.1
- PyTorch: 2.1.0+cu121
- Datasets: 2.18.0
- Tokenizers: 0.15.2

## Citation

### BibTeX

```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```