SetFit with intfloat/multilingual-e5-small

This is a SetFit model for Text Classification. It uses intfloat/multilingual-e5-small as the Sentence Transformer embedding model, with a SetFitHead instance for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
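
To make step 1 concrete: SetFit converts the labeled texts into sentence pairs, targeting a similarity of 1.0 for same-label pairs and 0.0 for different-label pairs, and fine-tunes the embedding model on those targets (here with CosineSimilarityLoss, per the hyperparameters below). A minimal sketch of the pair construction, using texts from the label table below; this illustrates the idea and is not SetFit's exact sampling code:

from itertools import combinations

# Labeled texts lifted from the label examples below; toy illustration only.
examples = [
    ("query: Agur, gero arte.", 1),          # Basque: "Goodbye, see you later."
    ("query: Me'n vaig ara.", 1),            # Catalan: "I'm leaving now."
    ("query: Dobro, hvala. Kaj pa ti?", 0),  # Slovenian: "Fine, thanks. And you?"
]

# Same-label pairs get target similarity 1.0, different-label pairs 0.0;
# the contrastive loss then pulls/pushes embeddings toward those targets.
pairs = [
    ((text_a, text_b), 1.0 if label_a == label_b else 0.0)
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2)
]

for (text_a, text_b), target in pairs:
    print(f"{target}: {text_a!r} <-> {text_b!r}")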

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: intfloat/multilingual-e5-small
  • Classification head: a SetFitHead instance
  • Number of Classes: 2
  • Model Size: 118M parameters (float32)

Model Sources

  • Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label 1:
  • 'query: ਚੰਗਾ ਜੀ, ਫਿਰ ਮਿਲਦੇ ਹਾਂ.' (Punjabi: "Alright, see you later.")
  • 'query: Agur, gero arte.' (Basque: "Goodbye, see you later.")
  • "query: Me'n vaig ara." (Catalan: "I'm leaving now.")
Label 0:
  • 'query: Dobro, hvala. Kaj pa ti?' (Slovenian: "Fine, thanks. And you?")
  • 'query: हाँ अगली बार जब तुम जाओ मुझे भी ले चलो मुझे भी प्रकृति में और गतिविधियाँ करनी हैं' (Hindi: "Yes, next time you go, take me along too; I also want to do more activities in nature.")
  • 'query: Mirë, faleminderit. Po ju?' (Albanian: "Fine, thank you. And you?")

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("thegenerativegeneration/stay_or_go_conversation_classifier_s")
# Run inference (keep the "query: " prefix used in the training data; E5 models expect it)
preds = model("query: Ναι, ας πάμε!")  # Greek: "Yes, let's go!"
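
The call above returns the predicted label (0 or 1 for this model). If you want class probabilities instead, SetFit models also expose predict_proba; a brief sketch, assuming the SetFit 1.x API:

# Probabilities per class, in label order (0, 1)
probs = model.predict_proba(["query: Ναι, ας πάμε!"])
print(probs)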

Training Details

Training Set Metrics

Training set | Min | Median | Max
------------ | --- | ------ | ---
Word count   | 2   | 7.4364 | 21

Label | Training Sample Count
----- | ---------------------
0     | 292
1     | 290

Training Hyperparameters

  • batch_size: (16, 2)
  • num_epochs: (1, 16)
  • max_steps: -1
  • sampling_strategy: undersampling
  • body_learning_rate: (1e-05, 1e-05)
  • head_learning_rate: 0.001
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.1
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • run_name: intfloat/multilingual-e5-small
  • eval_max_steps: -1
  • load_best_model_at_end: True
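
These names correspond one-to-one to fields of SetFit's TrainingArguments, so the run can be approximated as below. This is a minimal sketch assuming the SetFit 1.x API: the two-example train_dataset is invented for illustration, and the evaluation-related settings (eval_max_steps, load_best_model_at_end) are omitted because they require a validation split.

from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical two-example dataset; the card's actual training data is not included.
train_dataset = Dataset.from_dict({
    "text": ["query: Agur, gero arte.", "query: Dobro, hvala. Kaj pa ti?"],
    "label": [1, 0],
})

model = SetFitModel.from_pretrained("intfloat/multilingual-e5-small")

args = TrainingArguments(
    batch_size=(16, 2),                 # (embedding phase, classifier phase)
    num_epochs=(1, 16),                 # (embedding phase, classifier phase)
    sampling_strategy="undersampling",
    body_learning_rate=(1e-05, 1e-05),
    head_learning_rate=0.001,
    loss=CosineSimilarityLoss,
    warmup_proportion=0.1,
    end_to_end=False,
    use_amp=False,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()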

Training Results

Epoch Step Training Loss Validation Loss
0.0001 1 0.3645 -
0.0047 50 0.3527 -
0.0094 100 0.3424 0.3165
0.0142 150 0.3108 -
0.0189 200 0.2684 0.2215
0.0236 250 0.2197 -
0.0283 300 0.1707 0.1792
0.0331 350 0.1501 -
0.0378 400 0.0865 0.1607
0.0425 450 0.0534 -
0.0472 500 0.0307 0.1519
0.0520 550 0.0342 -
0.0567 600 0.0078 0.1478
0.0614 650 0.0144 -
0.0661 700 0.0658 0.1399
0.0709 750 0.0021 -
0.0756 800 0.0009 0.1512
0.0803 850 0.0005 -
0.0850 900 0.0018 0.1516
0.0897 950 0.0011 -
0.0945 1000 0.0012 0.1541
0.0992 1050 0.0003 -
0.1039 1100 0.0003 0.1415
0.1086 1150 0.0003 -
0.1134 1200 0.0002 0.1442
0.1181 1250 0.0006 -
0.1228 1300 0.0002 0.1298
0.1275 1350 0.0002 -
0.1323 1400 0.0001 0.1356
0.1370 1450 0.0002 -
0.1417 1500 0.0003 0.1493
0.1464 1550 0.0003 -
0.1512 1600 0.0002 0.1500
0.1559 1650 0.0002 -
0.1606 1700 0.0003 0.1469
0.1653 1750 0.0001 -
0.1701 1800 0.0001 0.1554
0.1748 1850 0.0002 -
0.1795 1900 0.0001 0.1680
0.1842 1950 0.0001 -
0.1889 2000 0.0004 0.1568
0.1937 2050 0.0001 -
0.1984 2100 0.0001 0.1513
0.2031 2150 0.0001 -
0.2078 2200 0.0003 0.1503
0.2126 2250 0.0002 -
0.2173 2300 0.0604 0.1550
0.2220 2350 0.0001 -
0.2267 2400 0.0002 0.1739
0.2315 2450 0.0006 -
0.2362 2500 0.0002 0.1558
0.2409 2550 0.0002 -
0.2456 2600 0.0001 0.1393
0.2504 2650 0.0004 -
0.2551 2700 0.0003 0.1642
0.2598 2750 0.0002 -
0.2645 2800 0.0002 0.1776
0.2692 2850 0.0 -
0.2740 2900 0.0002 0.1794
0.2787 2950 0.0001 -
0.2834 3000 0.0001 0.1830
0.2881 3050 0.0001 -
0.2929 3100 0.0001 0.1805
0.2976 3150 0.0001 -
0.3023 3200 0.0001 0.1757
0.3070 3250 0.0001 -
0.3118 3300 0.0001 0.1302
0.3165 3350 0.0001 -
0.3212 3400 0.0001 0.1348
0.3259 3450 0.0001 -
0.3307 3500 0.0005 0.1623
0.3354 3550 0.0 -
0.3401 3600 0.0 0.1286
0.3448 3650 0.0 -
0.3496 3700 0.0001 0.1736
0.3543 3750 0.0 -
0.3590 3800 0.0 0.1270
0.3637 3850 0.0 -
0.3684 3900 0.0001 0.1231
0.3732 3950 0.0 -
0.3779 4000 0.0001 0.1261
0.3826 4050 0.0001 -
**0.3873 4100 0.0 0.1216**
0.3921 4150 0.0 -
0.3968 4200 0.0 0.1404
0.4015 4250 0.0 -
0.4062 4300 0.0 0.1466
0.4110 4350 0.0 -
0.4157 4400 0.0 0.1482
0.4204 4450 0.0 -
0.4251 4500 0.0 0.1547
0.4299 4550 0.0 -
0.4346 4600 0.0 0.1566
0.4393 4650 0.0 -
0.4440 4700 0.0 0.1684
0.4487 4750 0.0 -
0.4535 4800 0.0 0.1746
0.4582 4850 0.0 -
0.4629 4900 0.0 0.167
0.4676 4950 0.0 -
0.4724 5000 0.0001 0.1683
0.4771 5050 0.0 -
0.4818 5100 0.0 0.1693
0.4865 5150 0.0 -
0.4913 5200 0.0 0.1694
0.4960 5250 0.0 -
0.5007 5300 0.0 0.162
0.5054 5350 0.0 -
0.5102 5400 0.0 0.1388
0.5149 5450 0.0 -
0.5196 5500 0.0 0.1353
0.5243 5550 0.0 -
0.5291 5600 0.0 0.1401
0.5338 5650 0.0 -
0.5385 5700 0.0 0.1466
0.5432 5750 0.0 -
0.5479 5800 0.0 0.1529
0.5527 5850 0.0 -
0.5574 5900 0.0 0.1488
0.5621 5950 0.0 -
0.5668 6000 0.0 0.147
0.5716 6050 0.0 -
0.5763 6100 0.0 0.1493
0.5810 6150 0.0 -
0.5857 6200 0.0 0.1525
0.5905 6250 0.0 -
0.5952 6300 0.0 0.1505
0.5999 6350 0.0 -
0.6046 6400 0.0 0.1554
0.6094 6450 0.0 -
0.6141 6500 0.0 0.1546
0.6188 6550 0.0 -
0.6235 6600 0.0 0.1598
0.6282 6650 0.0 -
0.6330 6700 0.0 0.179
0.6377 6750 0.0 -
0.6424 6800 0.0 0.1719
0.6471 6850 0.0001 -
0.6519 6900 0.0 0.1812
0.6566 6950 0.0 -
0.6613 7000 0.0 0.1648
0.6660 7050 0.0 -
0.6708 7100 0.0 0.1717
0.6755 7150 0.0 -
0.6802 7200 0.0 0.1793
0.6849 7250 0.0 -
0.6897 7300 0.0 0.1766
0.6944 7350 0.0 -
0.6991 7400 0.0 0.177
0.7038 7450 0.0 -
0.7085 7500 0.0 0.1749
0.7133 7550 0.0 -
0.7180 7600 0.0 0.1814
0.7227 7650 0.0 -
0.7274 7700 0.0 0.1742
0.7322 7750 0.0 -
0.7369 7800 0.0 0.179
0.7416 7850 0.0 -
0.7463 7900 0.0 0.1767
0.7511 7950 0.0 -
0.7558 8000 0.0 0.1809
0.7605 8050 0.0 -
0.7652 8100 0.0 0.1767
0.7700 8150 0.0 -
0.7747 8200 0.0 0.1698
0.7794 8250 0.0 -
0.7841 8300 0.0 0.1772
0.7889 8350 0.0 -
0.7936 8400 0.0 0.1722
0.7983 8450 0.0 -
0.8030 8500 0.0 0.1671
0.8077 8550 0.0 -
0.8125 8600 0.0 0.181
0.8172 8650 0.0 -
0.8219 8700 0.0 0.1788
0.8266 8750 0.0 -
0.8314 8800 0.0 0.1784
0.8361 8850 0.0 -
0.8408 8900 0.0 0.1806
0.8455 8950 0.0 -
0.8503 9000 0.0 0.1783
0.8550 9050 0.0 -
0.8597 9100 0.0 0.1783
0.8644 9150 0.0 -
0.8692 9200 0.0 0.1785
0.8739 9250 0.0 -
0.8786 9300 0.0 0.1772
0.8833 9350 0.0 -
0.8880 9400 0.0 0.1816
0.8928 9450 0.0 -
0.8975 9500 0.0 0.1794
0.9022 9550 0.0 -
0.9069 9600 0.0 0.168
0.9117 9650 0.0 -
0.9164 9700 0.0 0.1771
0.9211 9750 0.0 -
0.9258 9800 0.0 0.1675
0.9306 9850 0.0 -
0.9353 9900 0.0 0.1746
0.9400 9950 0.0 -
0.9447 10000 0.0 0.1769
0.9495 10050 0.0 -
0.9542 10100 0.0 0.177
0.9589 10150 0.0 -
0.9636 10200 0.0 0.1771
0.9684 10250 0.0 -
0.9731 10300 0.0 0.1794
0.9778 10350 0.0 -
0.9825 10400 0.0 0.177
0.9872 10450 0.0 -
0.9920 10500 0.0 0.1794
0.9967 10550 0.0 -
  • The bold row denotes the saved checkpoint; with load_best_model_at_end: True this is the step with the lowest validation loss (step 4100, 0.1216).

Framework Versions

  • Python: 3.10.11
  • SetFit: 1.0.3
  • Sentence Transformers: 2.7.0
  • Transformers: 4.39.0
  • PyTorch: 2.3.1
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2
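
To reproduce this environment, the versions above can be pinned at install time (on PyPI, the Sentence Transformers and PyTorch packages are named sentence-transformers and torch):

pip install setfit==1.0.3 sentence-transformers==2.7.0 transformers==4.39.0 torch==2.3.1 datasets==2.20.0 tokenizers==0.15.2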

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}