SetFit with sentence-transformers/all-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
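The contrastive step works on sentence pairs rather than single examples. The following is a minimal sketch of how labeled examples could be turned into pairs with similarity targets (1.0 for the same label, 0.0 for different labels), which is the kind of input a cosine-similarity loss expects. The function name `make_contrastive_pairs` is hypothetical, and the exhaustive pairing is a simplification: SetFit's own sampler (see the `sampling_strategy` hyperparameter) balances pairs differently.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Pair every two examples; target is 1.0 if labels match, else 0.0.

    Illustrative only: exhaustive pairing, not SetFit's actual sampler.
    """
    pairs = []
    for (t1, y1), (t2, y2) in combinations(zip(texts, labels), 2):
        pairs.append((t1, t2, 1.0 if y1 == y2 else 0.0))
    return pairs


# Three examples taken from the label examples in this card.
texts = [
    "How do the conclusions differ?",
    "Contrast the main arguments presented in each paper",
    "What are the current trends in construction?",
]
labels = ["compare", "compare", "simple_questions"]

pairs = make_contrastive_pairs(texts, labels)
for first, second, target in pairs:
    print(f"{target:.1f}  {first!r} <-> {second!r}")
```

Even a handful of labeled examples yields many pairs, which is what makes the approach workable in a few-shot setting.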

Model Details

Model Description

Model Sources

Model Labels

Label Examples
sub_queries
  • 'Could you break down the main factors I should consider when researching market prices and how to effectively communicate our needs to the supplier during negotiations?'
  • 'Comment faire pousser une plante et le mesurer ?'
  • "Quel est le meilleur matériau pour l'isolation phonique et thermique?"
simple_questions
  • 'What are the key strategies for maintaining efficient communication in a remote work environment?'
  • 'Could you summarize the ways a person can help in adapting to climate change ?'
  • 'What are the current trends in construction?'
exchange
  • 'Could you please restate your last explanation using simpler terms?'
  • 'Could you restate the impact of augmented reality on design practices?'
  • 'Pourriez-vous me donner un résumé des principaux points abordés dans notre conversation précédente ?'
compare
  • 'How do the conclusions differ?'
  • 'Contrast the main arguments presented in each paper'
  • 'Quelles sont les principales différences dans les programmes éducatifs décrits dans ces documents ?'
summary
  • 'Que dois-je retenir de ce doc ?'
  • 'What are the key assertions made within the text'
  • 'What are the most important argument stated in the document?'

Evaluation

Metrics

Label   Accuracy
all     0.9333
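Accuracy here is the fraction of evaluation examples whose predicted label equals the reference label. A minimal sketch of that computation follows; the five predictions and references are toy data invented for illustration, not the evaluation set behind the 0.9333 figure.

```python
def accuracy(predictions, references):
    """Fraction of positions where the predicted label equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy data (hypothetical): four of five predictions match.
refs = ["compare", "summary", "exchange", "sub_queries", "simple_questions"]
preds = ["compare", "summary", "exchange", "sub_queries", "summary"]

score = accuracy(preds, refs)
print(f"accuracy = {score:.4f}")
```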

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("egis-group/router_mini_lm_l6")
# Run inference
preds = model("Compare ces deux documents")
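One way to consume a prediction is to dispatch the query to a handler chosen by the predicted label. The sketch below assumes predictions come back as one of the five label strings listed under Label Examples. The handlers, the `route` function, and the `stub_classifier` that stands in for the loaded model are all hypothetical, so the example runs without downloading anything.

```python
LABELS = ("sub_queries", "simple_questions", "exchange", "compare", "summary")


def build_handlers():
    # Hypothetical handlers: each one just tags the query with its route.
    return {label: (lambda q, label=label: f"[{label}] {q}") for label in LABELS}


def route(query, classify, handlers, fallback="simple_questions"):
    """Classify the query and call the handler registered for that label."""
    label = str(classify(query))
    handler = handlers.get(label, handlers[fallback])
    return handler(query)


def stub_classifier(query):
    # Stand-in for the loaded SetFit model; replace with `model` in practice.
    return "compare" if "compare" in query.lower() else "simple_questions"


result = route("Compare ces deux documents", stub_classifier, build_handlers())
print(result)
```

The fallback label guards against an unexpected prediction value; which label makes a sensible default depends on the application.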

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     4     13.4389   48

Label      Training Sample Count
negative   0
positive   0

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
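To make `warmup_proportion: 0.1` concrete, the sketch below computes the learning rate at a given step under an assumed schedule of linear warmup followed by linear decay. That schedule shape is an assumption, not something this card states. The total of 13,204 steps is taken from the training results, and the base rate is the first value of `body_learning_rate`. The function name `warmup_linear_lr` is hypothetical.

```python
import math


def warmup_linear_lr(step, total_steps, base_lr, warmup_proportion):
    """Learning rate under linear warmup then linear decay (assumed schedule)."""
    warmup_steps = math.ceil(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


TOTAL_STEPS = 13204  # last step in the training results
BASE_LR = 2e-05      # first value of body_learning_rate
WARMUP = 0.1         # warmup_proportion

warmup_steps = math.ceil(TOTAL_STEPS * WARMUP)
for step in (0, warmup_steps // 2, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    lr = warmup_linear_lr(step, TOTAL_STEPS, BASE_LR, WARMUP)
    print(f"step {step:>5}: lr = {lr:.2e}")
```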

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.4073 -
0.0151 50 0.3054 -
0.0303 100 0.2066 -
0.0454 150 0.2664 -
0.0606 200 0.2463 -
0.0757 250 0.214 -
0.0909 300 0.1892 -
0.1060 350 0.1402 -
0.1212 400 0.1804 -
0.1363 450 0.0571 -
0.1515 500 0.0979 -
0.1666 550 0.1775 -
0.1818 600 0.0377 -
0.1969 650 0.0398 -
0.2121 700 0.0423 -
0.2272 750 0.0036 -
0.2424 800 0.0079 -
0.2575 850 0.0049 -
0.2726 900 0.0018 -
0.2878 950 0.0018 -
0.3029 1000 0.0032 -
0.3181 1050 0.0019 -
0.3332 1100 0.0008 -
0.3484 1150 0.0006 -
0.3635 1200 0.0006 -
0.3787 1250 0.0011 -
0.3938 1300 0.0005 -
0.4090 1350 0.001 -
0.4241 1400 0.0009 -
0.4393 1450 0.0004 -
0.4544 1500 0.0003 -
0.4696 1550 0.0003 -
0.4847 1600 0.0006 -
0.4998 1650 0.0003 -
0.5150 1700 0.0002 -
0.5301 1750 0.0002 -
0.5453 1800 0.0005 -
0.5604 1850 0.0003 -
0.5756 1900 0.0002 -
0.5907 1950 0.0002 -
0.6059 2000 0.0001 -
0.6210 2050 0.0002 -
0.6362 2100 0.0002 -
0.6513 2150 0.0001 -
0.6665 2200 0.0002 -
0.6816 2250 0.0002 -
0.6968 2300 0.0002 -
0.7119 2350 0.0002 -
0.7271 2400 0.0002 -
0.7422 2450 0.0002 -
0.7573 2500 0.0001 -
0.7725 2550 0.0001 -
0.7876 2600 0.0002 -
0.8028 2650 0.0001 -
0.8179 2700 0.0002 -
0.8331 2750 0.0007 -
0.8482 2800 0.0001 -
0.8634 2850 0.0001 -
0.8785 2900 0.0001 -
0.8937 2950 0.0001 -
0.9088 3000 0.0001 -
0.9240 3050 0.0002 -
0.9391 3100 0.0001 -
0.9543 3150 0.0001 -
0.9694 3200 0.0001 -
0.9846 3250 0.0001 -
0.9997 3300 0.0002 -
1.0 3301 - 0.0001
1.0148 3350 0.0003 -
1.0300 3400 0.0002 -
1.0451 3450 0.0001 -
1.0603 3500 0.0001 -
1.0754 3550 0.0001 -
1.0906 3600 0.0001 -
1.1057 3650 0.0001 -
1.1209 3700 0.0002 -
1.1360 3750 0.0001 -
1.1512 3800 0.0001 -
1.1663 3850 0.0001 -
1.1815 3900 0.0001 -
1.1966 3950 0.001 -
1.2118 4000 0.0001 -
1.2269 4050 0.0001 -
1.2420 4100 0.0001 -
1.2572 4150 0.0001 -
1.2723 4200 0.0001 -
1.2875 4250 0.0001 -
1.3026 4300 0.0001 -
1.3178 4350 0.0 -
1.3329 4400 0.0001 -
1.3481 4450 0.0001 -
1.3632 4500 0.0001 -
1.3784 4550 0.0001 -
1.3935 4600 0.0001 -
1.4087 4650 0.0001 -
1.4238 4700 0.0001 -
1.4390 4750 0.0001 -
1.4541 4800 0.0 -
1.4693 4850 0.0 -
1.4844 4900 0.0001 -
1.4995 4950 0.0001 -
1.5147 5000 0.0001 -
1.5298 5050 0.0001 -
1.5450 5100 0.0 -
1.5601 5150 0.0001 -
1.5753 5200 0.0 -
1.5904 5250 0.0 -
1.6056 5300 0.0001 -
1.6207 5350 0.0 -
1.6359 5400 0.0001 -
1.6510 5450 0.0 -
1.6662 5500 0.0001 -
1.6813 5550 0.0001 -
1.6965 5600 0.0 -
1.7116 5650 0.0 -
1.7267 5700 0.0 -
1.7419 5750 0.0001 -
1.7570 5800 0.0001 -
1.7722 5850 0.0 -
1.7873 5900 0.0 -
1.8025 5950 0.0001 -
1.8176 6000 0.0002 -
1.8328 6050 0.0 -
1.8479 6100 0.0001 -
1.8631 6150 0.0001 -
1.8782 6200 0.0001 -
1.8934 6250 0.0 -
1.9085 6300 0.0001 -
1.9237 6350 0.0 -
1.9388 6400 0.0001 -
1.9540 6450 0.0001 -
1.9691 6500 0.0 -
1.9842 6550 0.0 -
1.9994 6600 0.0 -
2.0 6602 - 0.0
2.0145 6650 0.0 -
2.0297 6700 0.0 -
2.0448 6750 0.0 -
2.0600 6800 0.0 -
2.0751 6850 0.0 -
2.0903 6900 0.0001 -
2.1054 6950 0.0 -
2.1206 7000 0.0 -
2.1357 7050 0.0 -
2.1509 7100 0.0001 -
2.1660 7150 0.0 -
2.1812 7200 0.0 -
2.1963 7250 0.0 -
2.2115 7300 0.0 -
2.2266 7350 0.0001 -
2.2417 7400 0.0 -
2.2569 7450 0.0 -
2.2720 7500 0.0001 -
2.2872 7550 0.0001 -
2.3023 7600 0.0 -
2.3175 7650 0.0 -
2.3326 7700 0.0 -
2.3478 7750 0.0 -
2.3629 7800 0.0 -
2.3781 7850 0.0 -
2.3932 7900 0.0 -
2.4084 7950 0.0 -
2.4235 8000 0.0 -
2.4387 8050 0.0 -
2.4538 8100 0.0001 -
2.4689 8150 0.0 -
2.4841 8200 0.0001 -
2.4992 8250 0.0 -
2.5144 8300 0.0 -
2.5295 8350 0.0001 -
2.5447 8400 0.0 -
2.5598 8450 0.0 -
2.5750 8500 0.0 -
2.5901 8550 0.0001 -
2.6053 8600 0.0001 -
2.6204 8650 0.0 -
2.6356 8700 0.0 -
2.6507 8750 0.0 -
2.6659 8800 0.0 -
2.6810 8850 0.0 -
2.6962 8900 0.0 -
2.7113 8950 0.0 -
2.7264 9000 0.0 -
2.7416 9050 0.0001 -
2.7567 9100 0.0001 -
2.7719 9150 0.0 -
2.7870 9200 0.0001 -
2.8022 9250 0.0 -
2.8173 9300 0.0 -
2.8325 9350 0.0 -
2.8476 9400 0.0 -
2.8628 9450 0.0 -
2.8779 9500 0.0 -
2.8931 9550 0.0 -
2.9082 9600 0.0 -
2.9234 9650 0.0 -
2.9385 9700 0.0 -
2.9537 9750 0.0 -
2.9688 9800 0.0 -
2.9839 9850 0.0 -
2.9991 9900 0.0 -
3.0 9903 - 0.0
3.0142 9950 0.0 -
3.0294 10000 0.0 -
3.0445 10050 0.0 -
3.0597 10100 0.0 -
3.0748 10150 0.0 -
3.0900 10200 0.0 -
3.1051 10250 0.0001 -
3.1203 10300 0.0001 -
3.1354 10350 0.0 -
3.1506 10400 0.0 -
3.1657 10450 0.0 -
3.1809 10500 0.0 -
3.1960 10550 0.0 -
3.2111 10600 0.0 -
3.2263 10650 0.0 -
3.2414 10700 0.0 -
3.2566 10750 0.0 -
3.2717 10800 0.0 -
3.2869 10850 0.0 -
3.3020 10900 0.0 -
3.3172 10950 0.0 -
3.3323 11000 0.0 -
3.3475 11050 0.0 -
3.3626 11100 0.0 -
3.3778 11150 0.0 -
3.3929 11200 0.0 -
3.4081 11250 0.0001 -
3.4232 11300 0.0 -
3.4384 11350 0.0 -
3.4535 11400 0.0 -
3.4686 11450 0.0 -
3.4838 11500 0.0 -
3.4989 11550 0.0 -
3.5141 11600 0.0 -
3.5292 11650 0.0 -
3.5444 11700 0.0 -
3.5595 11750 0.0 -
3.5747 11800 0.0 -
3.5898 11850 0.0 -
3.6050 11900 0.0 -
3.6201 11950 0.0 -
3.6353 12000 0.0 -
3.6504 12050 0.0 -
3.6656 12100 0.0001 -
3.6807 12150 0.0 -
3.6958 12200 0.0 -
3.7110 12250 0.0 -
3.7261 12300 0.0 -
3.7413 12350 0.0 -
3.7564 12400 0.0 -
3.7716 12450 0.0 -
3.7867 12500 0.0 -
3.8019 12550 0.0 -
3.8170 12600 0.0 -
3.8322 12650 0.0 -
3.8473 12700 0.0 -
3.8625 12750 0.0 -
3.8776 12800 0.0 -
3.8928 12850 0.0 -
3.9079 12900 0.0 -
3.9231 12950 0.0 -
3.9382 13000 0.0 -
3.9533 13050 0.0 -
3.9685 13100 0.0 -
3.9836 13150 0.0 -
3.9988 13200 0.0 -
4.0 13204 - 0.0
  • The bold row denotes the saved checkpoint.
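The epoch boundaries in the table allow a quick consistency check and a rough estimate of how many sentence pairs were seen per epoch. The sketch assumes each step consumes one batch of 16 pairs (the first value of `batch_size`) with no gradient accumulation. Because the last batch of an epoch may be partial, the result is a range rather than an exact count.

```python
# Steps at which each epoch ends, read from the table above.
STEPS_AT_EPOCH_END = {1: 3301, 2: 6602, 3: 9903, 4: 13204}
BATCH_SIZE = 16  # first value of batch_size (assumed to apply to this phase)

steps_per_epoch = STEPS_AT_EPOCH_END[1]

# Every epoch boundary should be a whole multiple of the first one.
consistent = all(
    step == epoch * steps_per_epoch for epoch, step in STEPS_AT_EPOCH_END.items()
)

# Upper bound: every batch full. Lower bound: last batch holds a single pair.
max_pairs_per_epoch = steps_per_epoch * BATCH_SIZE
min_pairs_per_epoch = (steps_per_epoch - 1) * BATCH_SIZE + 1

print(f"consistent epoch boundaries: {consistent}")
print(f"pairs per epoch: between {min_pairs_per_epoch} and {max_pairs_per_epoch}")
```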

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.3.0+cu121
  • Datasets: 2.19.2
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 33.4M params (Safetensors, tensor type F32)