
SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
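
The first step above can be illustrated with a minimal sketch, assuming contrastive fine-tuning works on sentence pairs whose target is 1.0 when both sentences share a label and 0.0 otherwise, and that the loss is the squared error between the pair's cosine similarity and that target. The toy texts, the function names, and the exhaustive pair enumeration are illustrative assumptions; SetFit itself samples pairs and computes embeddings with the Sentence Transformer body.

```python
import math
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for every unordered pair.

    target is 1.0 when both texts share a label, else 0.0.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(u, v, target):
    """Squared error between the pair's cosine similarity and its target."""
    return (cosine_similarity(u, v) - target) ** 2


# Hypothetical toy data: two examples per label.
texts = ["wildfire razes homes", "boat capsizes at sea", "new wave music", "cricket score"]
labels = [1, 1, 0, 0]
pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
```

Minimizing this loss pulls same-label embeddings together and pushes different-label embeddings apart, which is what makes the features usable by a simple classification head in the second step.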

Model Details

Model Description

Model Sources

Model Labels

Label Examples
0
  • "Was '80s New #Wave a #Casualty of #AIDS?: Tweet And Since they\x89Ûªd grown up watching David\x89Û_ http://t.co/qBecjli7cx"
  • "@CharlesDagnall He's getting 50 here I think. Salt. Wounds. Rub. In."
  • 'Navy sidelines 3 newest subs http://t.co/gpVZV0249Y'
1
  • 'The Latest: More Homes Razed by Northern California Wildfire - ABC News http://t.co/bKsYymvIsg #GN'
  • '@Durban_Knight Rescuers are searching for hundreds of migrants in the Mediterranean after a boat carr... http://t.co/cWCVBuBs01 @Nosy_Be'
  • 'NEMA Ekiti distributed relief materials to affected victims of Rain/Windstorm disaster at Ode-Ekiti in Gbonyin LGA.'

Evaluation

Metrics

Label Accuracy
all 0.8172
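
The card does not state which evaluation split produced this number. As a reminder of what the metric measures, here is a minimal sketch, assuming accuracy is the usual fraction of predictions that match the reference labels; the label lists are made-up toy values, not the model's outputs.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy labels: 4 of 5 predictions are correct.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
score = accuracy(y_true, y_pred)
```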

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("pEpOo/catastrophy5")
# Run inference
preds = model("Stuart Broad Takes Eight Before Joe Root Runs Riot Against Aussies")
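
To classify several texts at once, one option is a small helper that pairs each input with its predicted label. This is a sketch, assuming the loaded model exposes a `predict` method that accepts a list of strings and returns one label per input; `classify_many` and `KeywordStub` are hypothetical names, and the stub only stands in for the real model so the example runs offline.

```python
def classify_many(model, texts):
    """Pair each input text with its predicted integer label."""
    texts = list(texts)
    preds = model.predict(texts)
    return [(text, int(pred)) for text, pred in zip(texts, preds)]


class KeywordStub:
    """Hypothetical stand-in for a loaded SetFitModel (offline demo only)."""

    def predict(self, inputs):
        return [1 if "wildfire" in text.lower() else 0 for text in inputs]


results = classify_many(
    KeywordStub(),
    ["Wildfire razes more homes", "Navy sidelines subs"],
)
```

With the real model, pass the object returned by `SetFitModel.from_pretrained("pEpOo/catastrophy5")` in place of the stub.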

Training Details

Training Set Metrics

Training set Min Median Max
Word count 1 14.9796 54

Label Training Sample Count
0 1732
1 1313
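
Statistics like these can be recomputed from raw data with a short sketch, assuming words are counted by whitespace splitting (the card does not say how it counts them). The sample texts and function names are illustrative. Note that a median of integer word counts is always an integer or a half-integer, so the reported 14.9796 may actually be a mean; the sketch returns both so the two can be compared.

```python
from collections import Counter
from statistics import mean, median


def word_count_stats(texts):
    """Min, median, mean and max of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": median(counts),
        "mean": mean(counts),
        "max": max(counts),
    }


def label_counts(labels):
    """Number of training samples per label."""
    return dict(Counter(labels))


# Hypothetical toy data, not the actual training set.
texts = ["Navy sidelines 3 newest subs", "Salt. Wounds. Rub. In.", "Flood"]
labels = [0, 0, 1]
stats = word_count_stats(texts)
per_label = label_counts(labels)
```

For this card, the per-label counts sum to 1732 + 1313 = 3045 training samples.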

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 20
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
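
These settings determine how many optimizer steps one epoch contains. The following is a worked sketch, assuming that with `num_iterations` set, each training sample contributes `num_iterations` positive and `num_iterations` negative pairs, and that the last partial batch counts as a step. This formula is an assumption about SetFit's sampling, not something the card states, but it agrees with the Epoch and Step columns under Training Results (step 50 at epoch 0.0066, step 7600 at epoch 0.9983).

```python
import math


def contrastive_steps(n_samples, num_iterations, batch_size):
    """Return (number of pairs, steps per epoch) under the assumption above."""
    n_pairs = 2 * num_iterations * n_samples
    steps = math.ceil(n_pairs / batch_size)
    return n_pairs, steps


# 3045 = 1732 + 1313 training samples, num_iterations=20, batch_size=16.
n_pairs, steps_per_epoch = contrastive_steps(3045, 20, 16)
epoch_at_step_50 = round(50 / steps_per_epoch, 4)
epoch_at_step_7600 = round(7600 / steps_per_epoch, 4)
```

Under this assumption a single epoch covers 121,800 pairs in 7,613 steps, so lowering `num_iterations` is the most direct way to shorten training.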

Training Results

Epoch Step Training Loss Validation Loss
0.0001 1 0.3383 -
0.0066 50 0.352 -
0.0131 100 0.3529 -
0.0197 150 0.2286 -
0.0263 200 0.2654 -
0.0328 250 0.2892 -
0.0394 300 0.1808 -
0.0460 350 0.2056 -
0.0525 400 0.0863 -
0.0591 450 0.2034 -
0.0657 500 0.1339 -
0.0722 550 0.1022 -
0.0788 600 0.1083 -
0.0854 650 0.1035 -
0.0919 700 0.1201 -
0.0985 750 0.0626 -
0.1051 800 0.1257 -
0.1117 850 0.1543 -
0.1182 900 0.0367 -
0.1248 950 0.1749 -
0.1314 1000 0.0553 -
0.1379 1050 0.0836 -
0.1445 1100 0.0161 -
0.1511 1150 0.1149 -
0.1576 1200 0.1144 -
0.1642 1250 0.0028 -
0.1708 1300 0.0037 -
0.1773 1350 0.1769 -
0.1839 1400 0.0172 -
0.1905 1450 0.0397 -
0.1970 1500 0.0645 -
0.2036 1550 0.0659 -
0.2102 1600 0.0014 -
0.2167 1650 0.0016 -
0.2233 1700 0.0729 -
0.2299 1750 0.0072 -
0.2364 1800 0.0175 -
0.2430 1850 0.0278 -
0.2496 1900 0.0537 -
0.2561 1950 0.0038 -
0.2627 2000 0.087 -
0.2693 2050 0.0459 -
0.2758 2100 0.0169 -
0.2824 2150 0.0112 -
0.2890 2200 0.001 -
0.2955 2250 0.0204 -
0.3021 2300 0.0796 -
0.3087 2350 0.0592 -
0.3153 2400 0.0003 -
0.3218 2450 0.0033 -
0.3284 2500 0.0309 -
0.3350 2550 0.0065 -
0.3415 2600 0.002 -
0.3481 2650 0.0076 -
0.3547 2700 0.0008 -
0.3612 2750 0.0023 -
0.3678 2800 0.0028 -
0.3744 2850 0.0171 -
0.3809 2900 0.0011 -
0.3875 2950 0.0015 -
0.3941 3000 0.0468 -
0.4006 3050 0.0075 -
0.4072 3100 0.0009 -
0.4138 3150 0.0334 -
0.4203 3200 0.0002 -
0.4269 3250 0.0001 -
0.4335 3300 0.0002 -
0.4400 3350 0.0001 -
0.4466 3400 0.021 -
0.4532 3450 0.0043 -
0.4597 3500 0.0084 -
0.4663 3550 0.0009 -
0.4729 3600 0.0033 -
0.4794 3650 0.0035 -
0.4860 3700 0.0004 -
0.4926 3750 0.0297 -
0.4991 3800 0.0004 -
0.5057 3850 0.0011 -
0.5123 3900 0.0238 -
0.5188 3950 0.0248 -
0.5254 4000 0.0293 -
0.5320 4050 0.0365 -
0.5386 4100 0.0261 -
0.5451 4150 0.0469 -
0.5517 4200 0.0098 -
0.5583 4250 0.0002 -
0.5648 4300 0.0236 -
0.5714 4350 0.0001 -
0.5780 4400 0.0001 -
0.5845 4450 0.0001 -
0.5911 4500 0.0138 -
0.5977 4550 0.0116 -
0.6042 4600 0.0003 -
0.6108 4650 0.0003 -
0.6174 4700 0.0001 -
0.6239 4750 0.0 -
0.6305 4800 0.0246 -
0.6371 4850 0.0001 -
0.6436 4900 0.0543 -
0.6502 4950 0.0001 -
0.6568 5000 0.0093 -
0.6633 5050 0.0001 -
0.6699 5100 0.0 -
0.6765 5150 0.0002 -
0.6830 5200 0.0001 -
0.6896 5250 0.0372 -
0.6962 5300 0.0 -
0.7027 5350 0.0001 -
0.7093 5400 0.0001 -
0.7159 5450 0.0003 -
0.7224 5500 0.0004 -
0.7290 5550 0.0001 -
0.7356 5600 0.0 -
0.7422 5650 0.0 -
0.7487 5700 0.0001 -
0.7553 5750 0.0001 -
0.7619 5800 0.0 -
0.7684 5850 0.0 -
0.7750 5900 0.0 -
0.7816 5950 0.0 -
0.7881 6000 0.0 -
0.7947 6050 0.0 -
0.8013 6100 0.0 -
0.8078 6150 0.0001 -
0.8144 6200 0.0001 -
0.8210 6250 0.0 -
0.8275 6300 0.0 -
0.8341 6350 0.0 -
0.8407 6400 0.0002 -
0.8472 6450 0.0 -
0.8538 6500 0.0001 -
0.8604 6550 0.0 -
0.8669 6600 0.0001 -
0.8735 6650 0.0001 -
0.8801 6700 0.0 -
0.8866 6750 0.0 -
0.8932 6800 0.0373 -
0.8998 6850 0.0 -
0.9063 6900 0.0 -
0.9129 6950 0.0272 -
0.9195 7000 0.0 -
0.9260 7050 0.0 -
0.9326 7100 0.0001 -
0.9392 7150 0.0 -
0.9458 7200 0.0002 -
0.9523 7250 0.0001 -
0.9589 7300 0.0 -
0.9655 7350 0.0 -
0.9720 7400 0.0 -
0.9786 7450 0.0001 -
0.9852 7500 0.0 -
0.9917 7550 0.0 -
0.9983 7600 0.0 -
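
Each logged value is the loss of a single batch, so the table is noisy (for example, 0.0373 at step 6800 amid values near zero). One way to read the trend is to average short windows, sketched below with the first five and last five rows copied from the table; the window size of five is an arbitrary choice. No validation loss was logged, so the table alone cannot show whether the model overfits.

```python
from statistics import mean

# (step, training loss) rows copied from the start and end of the table.
first_rows = [(1, 0.3383), (50, 0.352), (100, 0.3529), (150, 0.2286), (200, 0.2654)]
last_rows = [(7400, 0.0), (7450, 0.0001), (7500, 0.0), (7550, 0.0), (7600, 0.0)]


def mean_loss(rows):
    """Average training loss over a window of (step, loss) rows."""
    return mean(loss for _, loss in rows)


early = mean_loss(first_rows)
late = mean_loss(last_rows)
```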

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.15.0
  • Tokenizers: 0.15.0
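
To check whether a local environment matches these versions, a small sketch using the standard library's `importlib.metadata` can help. The distribution names below are the usual PyPI names for these packages; the helper names are illustrative, and an exact match is not necessarily required to run the model.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by PyPI distribution name.
EXPECTED = {
    "setfit": "1.0.1",
    "sentence-transformers": "2.2.2",
    "transformers": "4.35.2",
    "torch": "2.1.0+cu121",
    "datasets": "2.15.0",
    "tokenizers": "0.15.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each mismatching package to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {found}")
```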

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Safetensors

  • Model size: 109M params
  • Tensor type: F32