SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
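The contrastive step works on sentence pairs built from the few labeled examples. The following is a minimal sketch of how such pairs might be generated, assuming same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The helper name `make_contrastive_pairs`, the toy data, and the sampling scheme are illustrative assumptions, not SetFit's actual implementation.

```python
import random
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str], labels: List[int], num_iterations: int = 1, seed: int = 42
) -> List[Tuple[str, str, float]]:
    """Hypothetical helper: for every sample and every iteration, emit one
    positive pair (same label, target 1.0) and one negative pair
    (different label, target 0.0)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates with the same label, excluding the sample itself.
            same = [t for j, (t, l) in enumerate(zip(texts, labels)) if l == label and j != i]
            # Candidates with any other label.
            diff = [t for t, l in zip(texts, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))
    return pairs


# Toy data for illustration only (not taken from the training set).
toy_texts = ["a1", "a2", "b1", "b2"]
toy_labels = [0, 0, 1, 1]
pairs = make_contrastive_pairs(toy_texts, toy_labels, num_iterations=2)
print(len(pairs))  # 4 samples x 2 pairs x 2 iterations = 16
```

Under this scheme the number of pairs grows linearly with both the sample count and the number of iterations, which is why a handful of labeled sentences can still yield a sizeable contrastive training set.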

Model Details

Model Description

Model Sources

Model Labels

Label Examples

Label 6:
  • 'If you were especially helpful in a corrupt scheme you received not just cash in a bag , but equity . '
  • 'Two American companies reached deals for fields auctioned in June . '
  • 'Let me prove it , Phil . '

Label 2:
  • 'This building shook like hell and it kept getting stronger . '
  • 'Now you could ask me , why should the user mind about MathML ? '
  • 'The report and a casebook of initiatives will be published in 1996 and provide the backdrop for a conference to be staged in Autumn , 1996 . '

Label 3:
  • 'The tumor , he suggested , developed when the second , normal copy also was damaged . '
  • 'Proper English bells are started off in rounds , from the highest-pitched bell to the lowest -- a simple descending scale using , in larger churches , as many as 12 bells . '
  • 'Treatment should be delayed or discontinued , or the dose reduced , in patients whose blood counts are abnormal or who have certain other side effects . '

Label 5:
  • 'Schools that are structured in this way produce students with higher morale and superior academic performance . '
  • 'I got home , let the dogs into the house and noticed some sounds above my head , as if someone were walking on the roof , or upstairs . '
  • 'Give me your address . '

Label 0:
  • '-- Most important of all , schools should have principals with a large measure of authority over the faculty , the curriculum , and all matters of student discipline . '
  • 'For months the Johns Hopkins researchers , using gene probes , experimentally crawled down the length of chromosome 17 , looking for the smallest common bit of genetic material lost in all tumor cells . '
  • 'It explains how the Committee for Medicinal Products for Human Use ( CHMP ) assessed the studies performed , to reach their recommendations on how to use the medicine . '

Label 4:
  • 'In 2005 , the fear of invasion of the national territory by hordes of Polish plumbers was felt both on the Left and on the Right . '
  • 'Cerenia contains the active substance maropitant and is available as tablet or as solution for injection . '
  • 'The second quarter was more of the same , but the Alavan team opted for the inside game of Barac and the work of Eliyahu , who was greeted with whistles and applause at his return home , to continue increasing their lead by half-time ( 34-43 ) . '

Label 1:
  • 'The sound of bells is a net to draw people into the church , he says . '
  • 'Progressive education ( as it was once called ) is far more interesting and agreeable to teachers than is disciplined instruction . '
  • "The defense lawyers also claim , for example , that Mr. Hayes may have been prejudiced when Judge Blue declined to allow them to test potential jurors ' reactions by showing them grisly crime-scene photographs during jury selection . "

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("HelgeKn/SemEval-multi-class-v1-10")
# Run inference
preds = model("`` But why pay her bills ? ")
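For more than one sentence, it can be convenient to pair each input with its prediction. The sketch below assumes the model object exposes `predict(list_of_texts)`, as `SetFitModel` does, and that the predicted labels can be converted to integers (the labels listed above are 0 to 6). The helper names `classify_batch` and `load_model` are illustrative, not part of the SetFit API.

```python
from typing import List, Sequence, Tuple


def classify_batch(model, texts: Sequence[str]) -> List[Tuple[str, int]]:
    """Pair each input text with its predicted integer label.

    `model` is assumed to expose `predict(list_of_texts)`.
    """
    preds = model.predict(list(texts))
    return [(text, int(pred)) for text, pred in zip(texts, preds)]


def load_model():
    # Imported lazily so classify_batch can be exercised without setfit installed.
    from setfit import SetFitModel

    return SetFitModel.from_pretrained("HelgeKn/SemEval-multi-class-v1-10")


# Example usage (downloads the model from the Hub):
# results = classify_batch(load_model(), ["Give me your address . ", "Let me prove it , Phil . "])
# for text, label in results:
#     print(label, text)
```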

Training Details

Training Set Metrics

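| Training set | Min | Median | Max |
|--------------|-----|--------|-----|
| Word count   | 5   | 25.8286 | 75 |

Note: 25.8286 cannot be the median of 70 integer word counts, which would be a whole number or end in .5. The value is consistent with a mean of 1808 words over 70 samples.

| Label | Training Sample Count |
|-------|-----------------------|
| 0     | 10                    |
| 1     | 10                    |
| 2     | 10                    |
| 3     | 10                    |
| 4     | 10                    |
| 5     | 10                    |
| 6     | 10                    |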
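Word-count statistics of this kind can be reproduced with a few lines of Python. This is a minimal sketch that assumes words are counted by splitting on whitespace, which may differ from how the card's figures were computed. The helper name `word_count_stats` is illustrative, and the two sentences are taken from the label examples above.

```python
from statistics import mean, median
from typing import Dict, Iterable


def word_count_stats(texts: Iterable[str]) -> Dict[str, float]:
    """Whitespace-token counts summarized as min, median, mean, and max."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": median(counts),
        "mean": mean(counts),
        "max": max(counts),
    }


sample = ["Give me your address . ", "Let me prove it , Phil . "]
stats = word_count_stats(sample)
print(stats)  # counts are 5 and 7, so min=5, median=6.0, mean=6, max=7
```

Reporting both the median and the mean makes it easy to see which of the two a summary table actually contains.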

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

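| Epoch  | Step | Training Loss | Validation Loss |
|--------|------|---------------|-----------------|
| 0.0057 | 1    | 0.2314        | -               |
| 0.2857 | 50   | 0.218         | -               |
| 0.5714 | 100  | 0.1161        | -               |
| 0.8571 | 150  | 0.0559        | -               |
| 1.1429 | 200  | 0.0087        | -               |
| 1.4286 | 250  | 0.0029        | -               |
| 1.7143 | 300  | 0.001         | -               |
| 2.0    | 350  | 0.0006        | -               |
| 2.2857 | 400  | 0.0011        | -               |
| 2.5714 | 450  | 0.0009        | -               |
| 2.8571 | 500  | 0.0005        | -               |
| 3.1429 | 550  | 0.0006        | -               |
| 3.4286 | 600  | 0.0004        | -               |
| 3.7143 | 650  | 0.0003        | -               |
| 4.0    | 700  | 0.0005        | -               |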
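The Epoch and Step columns can be cross-checked against the hyperparameters. The arithmetic below is a sketch that assumes one positive and one negative pair are generated per training sample per iteration. That assumption is not stated in this card, but it reproduces the 175 steps per epoch implied by the results (step 350 at epoch 2.0).

```python
import math

num_samples = 7 * 10   # 7 labels x 10 training samples each
num_iterations = 20    # from the hyperparameters
batch_size = 16        # embedding-phase batch size
num_epochs = 4         # embedding-phase epochs

# Assumption: one positive and one negative pair per sample per iteration.
pairs_per_epoch = 2 * num_iterations * num_samples
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(pairs_per_epoch, steps_per_epoch, total_steps)  # 2800 175 700
print(epoch_at(1), epoch_at(50), epoch_at(350))       # 0.0057 0.2857 2.0
```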

Framework Versions

  • Python: 3.9.13
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.36.0
  • PyTorch: 2.1.1+cpu
  • Datasets: 2.15.0
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}