SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a OneVsRestClassifier instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
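
As a rough illustration of step 1, the sketch below builds the kind of labeled sentence pairs that a contrastive objective such as CosineSimilarityLoss consumes: pairs with the same label get a target similarity of 1.0, pairs with different labels get 0.0. The function name and the toy data are hypothetical, and the sketch enumerates every pair for clarity, whereas SetFit samples pairs rather than listing all of them.

```python
import random
from itertools import combinations


def make_contrastive_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples: 1.0 for same label, else 0.0."""
    rng = random.Random(seed)
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    rng.shuffle(pairs)  # mix positive and negative pairs
    return pairs


# Hypothetical toy data, not taken from this model's training set.
texts = ["office chairs", "desk lamp", "laptop repair", "server maintenance"]
labels = ["furniture", "furniture", "it_services", "it_services"]
pairs = make_contrastive_pairs(texts, labels)
```

In step 2, the fine-tuned embedding model encodes each training text once, and the classification head is fit on those embeddings.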

Model Details

Model Description

Model Sources

Evaluation

Metrics

Label Accuracy
all 0.3217
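
The card does not say how this accuracy was computed. Since the head is a OneVsRestClassifier, one plausible reading (an assumption, not something stated here) is that the task is multi-label and the score is exact-match accuracy, where a prediction counts as correct only if every label in the row matches. A minimal sketch of that metric on hypothetical data:

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of examples whose full label vector is predicted exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


# Hypothetical multi-hot labels: three examples, three classes.
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
score = subset_accuracy(y_true, y_pred)  # the second row misses one label
```

Under this reading, a partially correct prediction earns no credit, which makes the metric stricter than per-label accuracy.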

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("amitprgx/setfit-categorization")
# Run inference
preds = model("300108")
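
If the model was trained as a multi-label classifier (an assumption suggested by the OneVsRestClassifier head), each prediction is a multi-hot row rather than a single class. The helper below is a small sketch for turning such rows into label names. The label names are hypothetical, since this card does not list the real ones.

```python
def decode_predictions(pred_rows, label_names):
    """Map each multi-hot prediction row to the list of active label names."""
    decoded = []
    for row in pred_rows:
        if len(row) != len(label_names):
            raise ValueError("prediction width does not match label_names")
        decoded.append(
            [name for name, flag in zip(label_names, row) if int(flag) == 1]
        )
    return decoded


# Hypothetical label names and predictions, for illustration only.
label_names = ["category_a", "category_b", "category_c"]
pred_rows = [[1, 0, 1], [0, 0, 0]]
decoded = decode_predictions(pred_rows, label_names)
```

With the loaded model, `pred_rows` would come from `model.predict([...])` called on a list of strings.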

Training Details

Training Set Metrics

Training set Min Median Max
Word count 1 4.7197 10

Training Hyperparameters

  • batch_size: (8, 8)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
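
In SetFit's training arguments, a tuple value gives the setting for the embedding fine-tuning phase and the classifier training phase, in that order, while a scalar applies to both. The sketch below separates a subset of the values listed above into one dictionary per phase. The helper name `split_phases` is hypothetical and is not part of SetFit.

```python
def split_phases(hparams):
    """Split (embedding, classifier) tuples into one dict per training phase."""
    embedding, classifier = {}, {}
    for name, value in hparams.items():
        if isinstance(value, tuple):
            embedding[name], classifier[name] = value
        else:
            # Scalars are shared by both phases.
            embedding[name] = classifier[name] = value
    return embedding, classifier


# Subset of the hyperparameters listed in this card.
hparams = {
    "batch_size": (8, 8),
    "num_epochs": (10, 10),
    "body_learning_rate": (2e-05, 2e-05),
    "head_learning_rate": 2e-05,
    "num_iterations": 20,
    "seed": 42,
}
embedding_phase, classifier_phase = split_phases(hparams)
```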

Training Results

Epoch Step Training Loss Validation Loss
0.0008 1 0.1444 -
0.0379 50 0.1563 -
0.0758 100 0.2163 -
0.1136 150 0.3125 -
0.1515 200 0.2152 -
0.1894 250 0.2731 -
0.2273 300 0.2788 -
0.2652 350 0.2315 -
0.3030 400 0.1847 -
0.3409 450 0.1253 -
0.3788 500 0.1363 -
0.4167 550 0.1816 -
0.4545 600 0.1957 -
0.4924 650 0.1931 -
0.5303 700 0.1392 -
0.5682 750 0.0613 -
0.6061 800 0.0403 -
0.6439 850 0.0796 -
0.6818 900 0.0661 -
0.7197 950 0.1207 -
0.7576 1000 0.0795 -
0.7955 1050 0.0439 -
0.8333 1100 0.0744 -
0.8712 1150 0.0972 -
0.9091 1200 0.0512 -
0.9470 1250 0.0335 -
0.9848 1300 0.0092 -
1.0227 1350 0.0489 -
1.0606 1400 0.0176 -
1.0985 1450 0.0302 -
1.1364 1500 0.0811 -
1.1742 1550 0.0181 -
1.2121 1600 0.0354 -
1.25 1650 0.0183 -
1.2879 1700 0.0167 -
1.3258 1750 0.006 -
1.3636 1800 0.0294 -
1.4015 1850 0.0342 -
1.4394 1900 0.005 -
1.4773 1950 0.0044 -
1.5152 2000 0.0069 -
1.5530 2050 0.0051 -
1.5909 2100 0.0375 -
1.6288 2150 0.0123 -
1.6667 2200 0.0058 -
1.7045 2250 0.0086 -
1.7424 2300 0.0141 -
1.7803 2350 0.0014 -
1.8182 2400 0.0047 -
1.8561 2450 0.0018 -
1.8939 2500 0.0063 -
1.9318 2550 0.0018 -
1.9697 2600 0.0032 -
2.0076 2650 0.001 -
2.0455 2700 0.0165 -
2.0833 2750 0.0773 -
2.1212 2800 0.0014 -
2.1591 2850 0.0105 -
2.1970 2900 0.0013 -
2.2348 2950 0.0009 -
2.2727 3000 0.0034 -
2.3106 3050 0.0013 -
2.3485 3100 0.0065 -
2.3864 3150 0.0008 -
2.4242 3200 0.1143 -
2.4621 3250 0.0036 -
2.5 3300 0.0254 -
2.5379 3350 0.0023 -
2.5758 3400 0.004 -
2.6136 3450 0.0034 -
2.6515 3500 0.0019 -
2.6894 3550 0.001 -
2.7273 3600 0.1044 -
2.7652 3650 0.0005 -
2.8030 3700 0.0955 -
2.8409 3750 0.0011 -
2.8788 3800 0.0018 -
2.9167 3850 0.0017 -
2.9545 3900 0.0007 -
2.9924 3950 0.001 -
3.0303 4000 0.0009 -
3.0682 4050 0.001 -
3.1061 4100 0.0035 -
3.1439 4150 0.0009 -
3.1818 4200 0.0009 -
3.2197 4250 0.0005 -
3.2576 4300 0.0011 -
3.2955 4350 0.0007 -
3.3333 4400 0.0007 -
3.3712 4450 0.0003 -
3.4091 4500 0.0008 -
3.4470 4550 0.0007 -
3.4848 4600 0.0004 -
3.5227 4650 0.0011 -
3.5606 4700 0.0009 -
3.5985 4750 0.0004 -
3.6364 4800 0.0006 -
3.6742 4850 0.0012 -
3.7121 4900 0.0004 -
3.75 4950 0.0003 -
3.7879 5000 0.0005 -
3.8258 5050 0.0007 -
3.8636 5100 0.0012 -
3.9015 5150 0.0003 -
3.9394 5200 0.0009 -
3.9773 5250 0.0003 -
4.0152 5300 0.0003 -
4.0530 5350 0.0005 -
4.0909 5400 0.0004 -
4.1288 5450 0.0003 -
4.1667 5500 0.0003 -
4.2045 5550 0.0011 -
4.2424 5600 0.0002 -
4.2803 5650 0.0004 -
4.3182 5700 0.0009 -
4.3561 5750 0.0003 -
4.3939 5800 0.0002 -
4.4318 5850 0.0008 -
4.4697 5900 0.0003 -
4.5076 5950 0.0004 -
4.5455 6000 0.0272 -
4.5833 6050 0.0012 -
4.6212 6100 0.0006 -
4.6591 6150 0.0005 -
4.6970 6200 0.0011 -
4.7348 6250 0.0003 -
4.7727 6300 0.0003 -
4.8106 6350 0.0026 -
4.8485 6400 0.0007 -
4.8864 6450 0.0002 -
4.9242 6500 0.0007 -
4.9621 6550 0.0004 -
5.0 6600 0.0002 -
5.0379 6650 0.0002 -
5.0758 6700 0.0003 -
5.1136 6750 0.0004 -
5.1515 6800 0.0007 -
5.1894 6850 0.0002 -
5.2273 6900 0.0002 -
5.2652 6950 0.0001 -
5.3030 7000 0.0003 -
5.3409 7050 0.0001 -
5.3788 7100 0.0002 -
5.4167 7150 0.0003 -
5.4545 7200 0.0006 -
5.4924 7250 0.0002 -
5.5303 7300 0.0002 -
5.5682 7350 0.0002 -
5.6061 7400 0.0004 -
5.6439 7450 0.0003 -
5.6818 7500 0.0002 -
5.7197 7550 0.0002 -
5.7576 7600 0.0002 -
5.7955 7650 0.0005 -
5.8333 7700 0.0013 -
5.8712 7750 0.0002 -
5.9091 7800 0.0015 -
5.9470 7850 0.0001 -
5.9848 7900 0.0002 -
6.0227 7950 0.0001 -
6.0606 8000 0.0015 -
6.0985 8050 0.0004 -
6.1364 8100 0.0373 -
6.1742 8150 0.0003 -
6.2121 8200 0.0002 -
6.25 8250 0.0003 -
6.2879 8300 0.0003 -
6.3258 8350 0.0003 -
6.3636 8400 0.0002 -
6.4015 8450 0.0001 -
6.4394 8500 0.0004 -
6.4773 8550 0.0002 -
6.5152 8600 0.0002 -
6.5530 8650 0.0002 -
6.5909 8700 0.0004 -
6.6288 8750 0.0002 -
6.6667 8800 0.0001 -
6.7045 8850 0.0003 -
6.7424 8900 0.0001 -
6.7803 8950 0.0002 -
6.8182 9000 0.0003 -
6.8561 9050 0.0002 -
6.8939 9100 0.0002 -
6.9318 9150 0.0001 -
6.9697 9200 0.0001 -
7.0076 9250 0.0002 -
7.0455 9300 0.0002 -
7.0833 9350 0.0002 -
7.1212 9400 0.0001 -
7.1591 9450 0.0002 -
7.1970 9500 0.0003 -
7.2348 9550 0.0005 -
7.2727 9600 0.0002 -
7.3106 9650 0.0002 -
7.3485 9700 0.0002 -
7.3864 9750 0.0002 -
7.4242 9800 0.0002 -
7.4621 9850 0.0001 -
7.5 9900 0.0001 -
7.5379 9950 0.0002 -
7.5758 10000 0.0001 -
7.6136 10050 0.0001 -
7.6515 10100 0.0001 -
7.6894 10150 0.0002 -
7.7273 10200 0.0002 -
7.7652 10250 0.0001 -
7.8030 10300 0.0002 -
7.8409 10350 0.0003 -
7.8788 10400 0.0002 -
7.9167 10450 0.0002 -
7.9545 10500 0.0001 -
7.9924 10550 0.0002 -
8.0303 10600 0.0002 -
8.0682 10650 0.0002 -
8.1061 10700 0.0002 -
8.1439 10750 0.0001 -
8.1818 10800 0.0001 -
8.2197 10850 0.0001 -
8.2576 10900 0.0001 -
8.2955 10950 0.0001 -
8.3333 11000 0.0002 -
8.3712 11050 0.0007 -
8.4091 11100 0.0001 -
8.4470 11150 0.0002 -
8.4848 11200 0.0001 -
8.5227 11250 0.0002 -
8.5606 11300 0.0001 -
8.5985 11350 0.0001 -
8.6364 11400 0.0001 -
8.6742 11450 0.0001 -
8.7121 11500 0.0002 -
8.75 11550 0.0001 -
8.7879 11600 0.0001 -
8.8258 11650 0.0001 -
8.8636 11700 0.0001 -
8.9015 11750 0.0001 -
8.9394 11800 0.0001 -
8.9773 11850 0.0001 -
9.0152 11900 0.0001 -
9.0530 11950 0.0001 -
9.0909 12000 0.0001 -
9.1288 12050 0.0001 -
9.1667 12100 0.0002 -
9.2045 12150 0.0001 -
9.2424 12200 0.0001 -
9.2803 12250 0.0002 -
9.3182 12300 0.0002 -
9.3561 12350 0.0002 -
9.3939 12400 0.0001 -
9.4318 12450 0.0003 -
9.4697 12500 0.0001 -
9.5076 12550 0.0001 -
9.5455 12600 0.0001 -
9.5833 12650 0.0002 -
9.6212 12700 0.0001 -
9.6591 12750 0.0002 -
9.6970 12800 0.0002 -
9.7348 12850 0.0001 -
9.7727 12900 0.0001 -
9.8106 12950 0.0001 -
9.8485 13000 0.0001 -
9.8864 13050 0.0001 -
9.9242 13100 0.0001 -
9.9621 13150 0.0001 -
10.0 13200 0.0002 -
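
The Epoch and Step columns can be cross-checked against the hyperparameters. The sketch below derives the steps per epoch from the table and then estimates the training set size. That last estimate rests on an assumption: that setting num_iterations yields 2 * num_iterations sentence pairs per training example, so treat the result as approximate.

```python
total_steps = 13200   # last Step value in the table
num_epochs = 10       # embedding-phase epochs
batch_size = 8        # embedding-phase batch size
num_iterations = 20

steps_per_epoch = total_steps // num_epochs
pairs_per_epoch = steps_per_epoch * batch_size

# Assumption: 2 * num_iterations pairs are generated per training example.
approx_train_examples = pairs_per_epoch / (2 * num_iterations)


def epoch_at(step):
    """Fractional epoch for a given step, rounded like the table."""
    return round(step / steps_per_epoch, 4)
```

Here `epoch_at(50)` reproduces the 0.0379 shown in the second row, which suggests the table logs one row every 50 steps of a 1320-step epoch.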

Framework Versions

  • Python: 3.11.8
  • SetFit: 1.1.0.dev0
  • Sentence Transformers: 2.6.1
  • Transformers: 4.39.3
  • PyTorch: 1.13.1+cu117
  • Datasets: 2.19.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}