SetFit with sentence-transformers/all-mpnet-base-v2
This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
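The contrastive step trains on sentence pairs rather than on single sentences. The following is a minimal sketch of how such pairs could be built from labeled texts, assuming same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0. The function name `make_contrastive_pairs` and the sampling scheme are illustrative assumptions, not SetFit's actual sampler, and the toy texts are paraphrased for brevity.

```python
import random


def make_contrastive_pairs(texts, labels, num_iterations=1, seed=42):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    For every anchor text and every iteration, draw one same-label partner
    (target 1.0) and one different-label partner (target 0.0).
    Illustrative only; SetFit's own sampler differs in detail.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidate partners with the same label, excluding the anchor itself.
            same = [t for j, (t, l) in enumerate(zip(texts, labels))
                    if l == label and j != i]
            # Candidate partners with a different label.
            diff = [t for t, l in zip(texts, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))
    return pairs


texts = [
    "wildfire razes more homes",
    "rescuers search for migrants",
    "navy sidelines newest subs",
    "he's getting 50 here I think",
]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels, num_iterations=2)
print(len(pairs))  # 4 texts x 2 iterations x 2 pairs each = 16
```

The classification head is then fitted on embeddings produced by the fine-tuned body, so it never sees the pairs themselves.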
Model Details
Model Description
Model Sources
Model Labels
| Label | Examples |
|-------|----------|
| 0 | "Was '80s New #Wave a #Casualty of #AIDS?: Tweet And Since they\x89Ûªd grown up watching David\x89Û_ http://t.co/qBecjli7cx" |
|   | "@CharlesDagnall He's getting 50 here I think. Salt. Wounds. Rub. In." |
|   | 'Navy sidelines 3 newest subs http://t.co/gpVZV0249Y' |
| 1 | 'The Latest: More Homes Razed by Northern California Wildfire - ABC News http://t.co/bKsYymvIsg #GN' |
|   | '@Durban_Knight Rescuers are searching for hundreds of migrants in the Mediterranean after a boat carr... http://t.co/cWCVBuBs01 @Nosy_Be' |
|   | 'NEMA Ekiti distributed relief materials to affected victims of Rain/Windstorm disaster at Ode-Ekiti in Gbonyin LGA.' |
Evaluation
Metrics
| Label | Accuracy |
|-------|----------|
| all   | 0.8172   |
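Accuracy here is the fraction of evaluation examples whose predicted label matches the reference label. A minimal sketch of that computation follows; the toy label lists are made up for illustration and are not the model's evaluation data.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of positions where prediction equals reference."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy example: 4 of 5 predictions match the reference labels.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
score = accuracy(y_true, y_pred)
print(score)
```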
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
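from setfit import SetFitModel

# Download the model from the Hugging Face Hub
model = SetFitModel.from_pretrained("pEpOo/catastrophy5")
# Run inference on a single text; the result is the predicted label (0 or 1)
preds = model("Stuart Broad Takes Eight Before Joe Root Runs Riot Against Aussies")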
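For several texts at once, it can help to pair each input with a readable label name. The sketch below wraps any callable that maps a list of texts to one label id per text. The names "not disaster" and "disaster" are an assumption inferred from the label examples in this card, and `fake_model` is a hypothetical stand-in so the snippet runs without downloading the model; in practice the loaded `SetFitModel` would be passed instead.

```python
# Assumed human-readable names for the two label ids (not stated in the card).
LABEL_NAMES = {0: "not disaster", 1: "disaster"}


def classify(model, texts):
    """Pair each text with the readable name of its predicted label.

    `model` is any callable that takes a list of strings and returns
    one label id per string.
    """
    preds = model(texts)
    return [(text, LABEL_NAMES[int(p)]) for text, p in zip(texts, preds)]


def fake_model(texts):
    """Hypothetical stand-in for the real model: a trivial keyword rule."""
    return [1 if "wildfire" in t.lower() else 0 for t in texts]


results = classify(fake_model, [
    "Homes razed by wildfire",
    "Navy sidelines 3 newest subs",
])
for text, name in results:
    print(f"{name}: {text}")
```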
Training Details
Training Set Metrics
| Training set | Min | Median  | Max |
|--------------|-----|---------|-----|
| Word count   | 1   | 14.9796 | 54  |
| Label | Training Sample Count |
|-------|-----------------------|
| 0     | 1732                  |
| 1     | 1313                  |
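The per-label counts give the size and class balance of the training set. A short worked calculation, using only the counts listed above:

```python
# Training sample counts per label, as listed in this card.
counts = {0: 1732, 1: 1313}

total = sum(counts.values())  # 3045 training samples in total
shares = {label: n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"label {label}: {counts[label]} samples ({share:.1%})")
```

Label 0 makes up roughly 57% of the training samples, so the set is moderately imbalanced.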
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
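With `loss: CosineSimilarityLoss`, the embedding body is trained so that the cosine similarity of a pair's embeddings approaches the pair's target value. The sketch below reproduces that idea in plain Python on toy 2-D vectors, assuming the loss is the mean squared error between cosine similarity and target. The vectors and function names are illustrative; the real loss operates on batches of model embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(batch):
    """Mean squared error between cosine similarity and target per pair."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in batch]
    return sum(errors) / len(errors)


batch = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same label: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different label: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different label: error 1
]
loss = cosine_similarity_loss(batch)
print(loss)  # (0 + 0 + 1) / 3
```

Only the third pair contributes to the loss, which is the case fine-tuning pushes apart: two texts with different labels that currently embed to the same point.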
Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|--------|------|---------------|-----------------|
| 0.0001 | 1    | 0.3383        | -               |
| 0.0066 | 50   | 0.352         | -               |
| 0.0131 | 100  | 0.3529        | -               |
| 0.0197 | 150  | 0.2286        | -               |
| 0.0263 | 200  | 0.2654        | -               |
| 0.0328 | 250  | 0.2892        | -               |
| 0.0394 | 300  | 0.1808        | -               |
| 0.0460 | 350  | 0.2056        | -               |
| 0.0525 | 400  | 0.0863        | -               |
| 0.0591 | 450  | 0.2034        | -               |
| 0.0657 | 500  | 0.1339        | -               |
| 0.0722 | 550  | 0.1022        | -               |
| 0.0788 | 600  | 0.1083        | -               |
| 0.0854 | 650  | 0.1035        | -               |
| 0.0919 | 700  | 0.1201        | -               |
| 0.0985 | 750  | 0.0626        | -               |
| 0.1051 | 800  | 0.1257        | -               |
| 0.1117 | 850  | 0.1543        | -               |
| 0.1182 | 900  | 0.0367        | -               |
| 0.1248 | 950  | 0.1749        | -               |
| 0.1314 | 1000 | 0.0553        | -               |
| 0.1379 | 1050 | 0.0836        | -               |
| 0.1445 | 1100 | 0.0161        | -               |
| 0.1511 | 1150 | 0.1149        | -               |
| 0.1576 | 1200 | 0.1144        | -               |
| 0.1642 | 1250 | 0.0028        | -               |
| 0.1708 | 1300 | 0.0037        | -               |
| 0.1773 | 1350 | 0.1769        | -               |
| 0.1839 | 1400 | 0.0172        | -               |
| 0.1905 | 1450 | 0.0397        | -               |
| 0.1970 | 1500 | 0.0645        | -               |
| 0.2036 | 1550 | 0.0659        | -               |
| 0.2102 | 1600 | 0.0014        | -               |
| 0.2167 | 1650 | 0.0016        | -               |
| 0.2233 | 1700 | 0.0729        | -               |
| 0.2299 | 1750 | 0.0072        | -               |
| 0.2364 | 1800 | 0.0175        | -               |
| 0.2430 | 1850 | 0.0278        | -               |
| 0.2496 | 1900 | 0.0537        | -               |
| 0.2561 | 1950 | 0.0038        | -               |
| 0.2627 | 2000 | 0.087         | -               |
| 0.2693 | 2050 | 0.0459        | -               |
| 0.2758 | 2100 | 0.0169        | -               |
| 0.2824 | 2150 | 0.0112        | -               |
| 0.2890 | 2200 | 0.001         | -               |
| 0.2955 | 2250 | 0.0204        | -               |
| 0.3021 | 2300 | 0.0796        | -               |
| 0.3087 | 2350 | 0.0592        | -               |
| 0.3153 | 2400 | 0.0003        | -               |
| 0.3218 | 2450 | 0.0033        | -               |
| 0.3284 | 2500 | 0.0309        | -               |
| 0.3350 | 2550 | 0.0065        | -               |
| 0.3415 | 2600 | 0.002         | -               |
| 0.3481 | 2650 | 0.0076        | -               |
| 0.3547 | 2700 | 0.0008        | -               |
| 0.3612 | 2750 | 0.0023        | -               |
| 0.3678 | 2800 | 0.0028        | -               |
| 0.3744 | 2850 | 0.0171        | -               |
| 0.3809 | 2900 | 0.0011        | -               |
| 0.3875 | 2950 | 0.0015        | -               |
| 0.3941 | 3000 | 0.0468        | -               |
| 0.4006 | 3050 | 0.0075        | -               |
| 0.4072 | 3100 | 0.0009        | -               |
| 0.4138 | 3150 | 0.0334        | -               |
| 0.4203 | 3200 | 0.0002        | -               |
| 0.4269 | 3250 | 0.0001        | -               |
| 0.4335 | 3300 | 0.0002        | -               |
| 0.4400 | 3350 | 0.0001        | -               |
| 0.4466 | 3400 | 0.021         | -               |
| 0.4532 | 3450 | 0.0043        | -               |
| 0.4597 | 3500 | 0.0084        | -               |
| 0.4663 | 3550 | 0.0009        | -               |
| 0.4729 | 3600 | 0.0033        | -               |
| 0.4794 | 3650 | 0.0035        | -               |
| 0.4860 | 3700 | 0.0004        | -               |
| 0.4926 | 3750 | 0.0297        | -               |
| 0.4991 | 3800 | 0.0004        | -               |
| 0.5057 | 3850 | 0.0011        | -               |
| 0.5123 | 3900 | 0.0238        | -               |
| 0.5188 | 3950 | 0.0248        | -               |
| 0.5254 | 4000 | 0.0293        | -               |
| 0.5320 | 4050 | 0.0365        | -               |
| 0.5386 | 4100 | 0.0261        | -               |
| 0.5451 | 4150 | 0.0469        | -               |
| 0.5517 | 4200 | 0.0098        | -               |
| 0.5583 | 4250 | 0.0002        | -               |
| 0.5648 | 4300 | 0.0236        | -               |
| 0.5714 | 4350 | 0.0001        | -               |
| 0.5780 | 4400 | 0.0001        | -               |
| 0.5845 | 4450 | 0.0001        | -               |
| 0.5911 | 4500 | 0.0138        | -               |
| 0.5977 | 4550 | 0.0116        | -               |
| 0.6042 | 4600 | 0.0003        | -               |
| 0.6108 | 4650 | 0.0003        | -               |
| 0.6174 | 4700 | 0.0001        | -               |
| 0.6239 | 4750 | 0.0           | -               |
| 0.6305 | 4800 | 0.0246        | -               |
| 0.6371 | 4850 | 0.0001        | -               |
| 0.6436 | 4900 | 0.0543        | -               |
| 0.6502 | 4950 | 0.0001        | -               |
| 0.6568 | 5000 | 0.0093        | -               |
| 0.6633 | 5050 | 0.0001        | -               |
| 0.6699 | 5100 | 0.0           | -               |
| 0.6765 | 5150 | 0.0002        | -               |
| 0.6830 | 5200 | 0.0001        | -               |
| 0.6896 | 5250 | 0.0372        | -               |
| 0.6962 | 5300 | 0.0           | -               |
| 0.7027 | 5350 | 0.0001        | -               |
| 0.7093 | 5400 | 0.0001        | -               |
| 0.7159 | 5450 | 0.0003        | -               |
| 0.7224 | 5500 | 0.0004        | -               |
| 0.7290 | 5550 | 0.0001        | -               |
| 0.7356 | 5600 | 0.0           | -               |
| 0.7422 | 5650 | 0.0           | -               |
| 0.7487 | 5700 | 0.0001        | -               |
| 0.7553 | 5750 | 0.0001        | -               |
| 0.7619 | 5800 | 0.0           | -               |
| 0.7684 | 5850 | 0.0           | -               |
| 0.7750 | 5900 | 0.0           | -               |
| 0.7816 | 5950 | 0.0           | -               |
| 0.7881 | 6000 | 0.0           | -               |
| 0.7947 | 6050 | 0.0           | -               |
| 0.8013 | 6100 | 0.0           | -               |
| 0.8078 | 6150 | 0.0001        | -               |
| 0.8144 | 6200 | 0.0001        | -               |
| 0.8210 | 6250 | 0.0           | -               |
| 0.8275 | 6300 | 0.0           | -               |
| 0.8341 | 6350 | 0.0           | -               |
| 0.8407 | 6400 | 0.0002        | -               |
| 0.8472 | 6450 | 0.0           | -               |
| 0.8538 | 6500 | 0.0001        | -               |
| 0.8604 | 6550 | 0.0           | -               |
| 0.8669 | 6600 | 0.0001        | -               |
| 0.8735 | 6650 | 0.0001        | -               |
| 0.8801 | 6700 | 0.0           | -               |
| 0.8866 | 6750 | 0.0           | -               |
| 0.8932 | 6800 | 0.0373        | -               |
| 0.8998 | 6850 | 0.0           | -               |
| 0.9063 | 6900 | 0.0           | -               |
| 0.9129 | 6950 | 0.0272        | -               |
| 0.9195 | 7000 | 0.0           | -               |
| 0.9260 | 7050 | 0.0           | -               |
| 0.9326 | 7100 | 0.0001        | -               |
| 0.9392 | 7150 | 0.0           | -               |
| 0.9458 | 7200 | 0.0002        | -               |
| 0.9523 | 7250 | 0.0001        | -               |
| 0.9589 | 7300 | 0.0           | -               |
| 0.9655 | 7350 | 0.0           | -               |
| 0.9720 | 7400 | 0.0           | -               |
| 0.9786 | 7450 | 0.0001        | -               |
| 0.9852 | 7500 | 0.0           | -               |
| 0.9917 | 7550 | 0.0           | -               |
| 0.9983 | 7600 | 0.0           | -               |
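The Epoch and Step columns can be cross-checked against the training set size and hyperparameters. The sketch below assumes that the sampler emits one positive and one negative pair per training sample per iteration; that assumption is not stated in this card, but it reproduces the logged epoch fractions.

```python
import math

# Values taken from this card.
n_samples = 1732 + 1313   # training samples across both labels
num_iterations = 20       # num_iterations hyperparameter
batch_size = 16           # embedding-phase batch size

# Assumption: each sample yields one positive and one negative pair per iteration.
pairs = 2 * num_iterations * n_samples           # 121800 pairs
steps_per_epoch = math.ceil(pairs / batch_size)  # 7613 steps


def epoch_at(step):
    """Fraction of the single epoch completed at a given step, to 4 decimals."""
    return round(step / steps_per_epoch, 4)


for step in (1, 50, 7600):
    print(step, epoch_at(step))
```

Under that assumption, step 50 falls at epoch 0.0066 and step 7600 at epoch 0.9983, matching the table, so the log covers nearly one full pass over the generated pairs.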
Framework Versions
- Python: 3.10.12
- SetFit: 1.0.1
- Sentence Transformers: 2.2.2
- Transformers: 4.35.2
- PyTorch: 2.1.0+cu121
- Datasets: 2.15.0
- Tokenizers: 0.15.0
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}