
SetFit with intfloat/multilingual-e5-small

This is a SetFit model for text classification. It uses intfloat/multilingual-e5-small as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
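The first step relies on turning a handful of labelled sentences into many training pairs. The sketch below is a simplified illustration of that idea, not SetFit's actual sampler: the helper name `make_contrastive_pairs` is hypothetical, and the targets 1.0 (same label) and 0.0 (different labels) are an assumption chosen to suit a cosine-similarity style loss.

```python
from itertools import combinations
from typing import List, Tuple

# A few labelled examples taken from this card's "Label Examples" section.
EXAMPLES: List[Tuple[str, str]] = [
    ("Comment rédiger un contrat de travail ?", "independent"),
    ("Comment peut-on contester un licenciement abusif ?", "independent"),
    ("Quelles sont les conséquences de cette loi ?", "follow_up"),
    ("Comment puis-je obtenir plus d'informations sur ce sujet ?", "follow_up"),
]


def make_contrastive_pairs(
    examples: List[Tuple[str, str]],
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target) triples from labelled sentences.

    Hypothetical helper: target is 1.0 when both sentences share a label
    and 0.0 otherwise (an assumption made for this illustration).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


pairs = make_contrastive_pairs(EXAMPLES)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")
```

Four labelled sentences already yield six pairs, which is why contrastive fine-tuning can work with very few examples per class.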

Model Details

Model Description

Model Sources

Model Labels

Label Examples
independent
  • 'Comment rédiger un contrat de travail ?'
  • 'Quels sont les impôts et taxes applicables aux entreprises ?'
  • 'Comment peut-on contester un licenciement abusif ?'
follow_up
  • 'Quelles sont les conséquences de cette loi ?'
  • "Comment cette loi s'inscrit-elle dans le cadre plus large du droit algérien ?"
  • "Comment puis-je obtenir plus d'informations sur ce sujet ?"

Evaluation

Metrics

Label Accuracy
all 1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("super-cinnamon/fewshot-followup-multi-e5")
# Run inference
preds = model("Comment se déroule une procédure de divorce ?")
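
When several questions need to be classified at once, it can help to group them by predicted label. The helper below is a hypothetical convenience function, not part of SetFit; it accepts any callable that maps a list of strings to one prediction per string. Passing `model.predict` from the snippet above should work, assuming the model returns the string labels `independent` and `follow_up`. The keyword-based `toy_classifier` is only a stand-in so that the sketch runs without downloading the model.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def group_by_label(
    classify: Callable[[List[str]], Sequence],
    questions: Sequence[str],
) -> Dict[str, List[str]]:
    """Group questions by the label that `classify` predicts for each one."""
    predictions = classify(list(questions))
    grouped: Dict[str, List[str]] = defaultdict(list)
    for question, label in zip(questions, predictions):
        # str() keeps the keys uniform; this assumes labels are plain strings.
        grouped[str(label)].append(question)
    return dict(grouped)


def toy_classifier(texts: List[str]) -> List[str]:
    """Stand-in for the real model (hypothetical keyword rule, demo only)."""
    markers = ("cette loi", "ce sujet")
    return [
        "follow_up" if any(marker in text for marker in markers) else "independent"
        for text in texts
    ]


questions = [
    "Comment se déroule une procédure de divorce ?",
    "Quelles sont les conséquences de cette loi ?",
    "Comment rédiger un contrat de travail ?",
    "Comment puis-je obtenir plus d'informations sur ce sujet ?",
]

grouped = group_by_label(toy_classifier, questions)
print(grouped)

# With the real model loaded as shown earlier (requires network access):
# grouped = group_by_label(model.predict, questions)
```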

Training Details

Training Set Metrics

Training set Min Median Max
Word count 1 9.6184 16

Label Training Sample Count
independent 43
follow_up 33

Training Hyperparameters

  • batch_size: (8, 8)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
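
The `batch_size` and `sampling_strategy` settings, together with the training sample counts, determine how many optimizer steps one epoch of contrastive fine-tuning takes. The following back-of-the-envelope calculation is a sketch under assumptions about the pair sampler that this card does not state: that positive pairs include each sentence paired with itself, and that `oversampling` repeats the smaller pair set until it matches the larger one. The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def estimate_steps_per_epoch(
    class_counts: Sequence[int], batch_size: int
) -> Tuple[int, int, int]:
    """Estimate (positive pairs, negative pairs, steps per epoch).

    Assumptions (not confirmed by this card):
      * pairs are unordered and a sentence may be paired with itself;
      * oversampling balances positives and negatives by repeating the
        smaller set, so one epoch sees 2 * max(positives, negatives) pairs.
    """
    # Same-label pairs, including self-pairs: n * (n + 1) / 2 per class.
    positives = sum(n * (n + 1) // 2 for n in class_counts)
    # All unordered pairs (with self-pairs) minus the same-label ones.
    total = sum(class_counts)
    negatives = total * (total + 1) // 2 - positives
    pairs_per_epoch = 2 * max(positives, negatives)
    steps = math.ceil(pairs_per_epoch / batch_size)
    return positives, negatives, steps


# 43 "independent" and 33 "follow_up" samples, embedding batch size 8.
positives, negatives, steps = estimate_steps_per_epoch([43, 33], batch_size=8)
print(positives, negatives, steps)
```

Under these assumptions the estimate is 377 steps per epoch, which agrees with the Training Results log, where step 3750 corresponds to epoch 9.9469 (3750 / 9.9469 ≈ 377).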

Training Results

Epoch Step Training Loss Validation Loss
0.0027 1 0.3915 -
0.1326 50 0.3193 -
0.2653 100 0.2252 -
0.3979 150 0.1141 -
0.5305 200 0.0197 -
0.6631 250 0.0019 -
0.7958 300 0.0021 -
0.9284 350 0.0002 -
1.0610 400 0.0008 -
1.1936 450 0.0005 -
1.3263 500 0.0002 -
1.4589 550 0.0002 -
1.5915 600 0.0007 -
1.7241 650 0.0001 -
1.8568 700 0.0003 -
1.9894 750 0.0002 -
2.1220 800 0.0001 -
2.2546 850 0.0002 -
2.3873 900 0.0 -
2.5199 950 0.0003 -
2.6525 1000 0.0001 -
2.7851 1050 0.0001 -
2.9178 1100 0.0001 -
3.0504 1150 0.0001 -
3.1830 1200 0.0001 -
3.3156 1250 0.0001 -
3.4483 1300 0.0001 -
3.5809 1350 0.0001 -
3.7135 1400 0.0 -
3.8462 1450 0.0 -
3.9788 1500 0.0 -
4.1114 1550 0.0 -
4.2440 1600 0.0001 -
4.3767 1650 0.0001 -
4.5093 1700 0.0001 -
4.6419 1750 0.0001 -
4.7745 1800 0.0 -
4.9072 1850 0.0001 -
5.0398 1900 0.0 -
5.1724 1950 0.0001 -
5.3050 2000 0.0 -
5.4377 2050 0.0001 -
5.5703 2100 0.0 -
5.7029 2150 0.0 -
5.8355 2200 0.0 -
5.9682 2250 0.0001 -
6.1008 2300 0.0001 -
6.2334 2350 0.0 -
6.3660 2400 0.0001 -
6.4987 2450 0.0 -
6.6313 2500 0.0 -
6.7639 2550 0.0 -
6.8966 2600 0.0 -
7.0292 2650 0.0 -
7.1618 2700 0.0 -
7.2944 2750 0.0 -
7.4271 2800 0.0001 -
7.5597 2850 0.0 -
7.6923 2900 0.0 -
7.8249 2950 0.0 -
7.9576 3000 0.0 -
8.0902 3050 0.0 -
8.2228 3100 0.0 -
8.3554 3150 0.0 -
8.4881 3200 0.0001 -
8.6207 3250 0.0 -
8.7533 3300 0.0 -
8.8859 3350 0.0 -
9.0186 3400 0.0001 -
9.1512 3450 0.0 -
9.2838 3500 0.0 -
9.4164 3550 0.0001 -
9.5491 3600 0.0 -
9.6817 3650 0.0001 -
9.8143 3700 0.0 -
9.9469 3750 0.0001 -
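
The log can also be read programmatically, for example to find where the contrastive loss stops changing meaningfully. The sketch below copies the first ten rows of the table and reports the first logged step whose training loss falls below a threshold. The threshold of 0.001 and the function name are illustrative assumptions, not values used by the author.

```python
from typing import List, Optional, Tuple

# (epoch, step, training loss) for the first ten logged rows of the table.
LOG_ROWS: List[Tuple[float, int, float]] = [
    (0.0027, 1, 0.3915),
    (0.1326, 50, 0.3193),
    (0.2653, 100, 0.2252),
    (0.3979, 150, 0.1141),
    (0.5305, 200, 0.0197),
    (0.6631, 250, 0.0019),
    (0.7958, 300, 0.0021),
    (0.9284, 350, 0.0002),
    (1.0610, 400, 0.0008),
    (1.1936, 450, 0.0005),
]


def first_row_below(
    rows: List[Tuple[float, int, float]], threshold: float
) -> Optional[Tuple[float, int, float]]:
    """Return the first logged row whose loss is below `threshold`, if any."""
    for row in rows:
        if row[2] < threshold:
            return row
    return None


hit = first_row_below(LOG_ROWS, threshold=0.001)
print(hit)
```

With this threshold the first matching row is step 350 (epoch 0.9284), so the training loss is already close to zero before the first epoch ends. This suggests that fewer than 10 epochs might have sufficed, although the card reports no validation loss to confirm it.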

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.1
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu118
  • Datasets: 2.15.0
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 118M params (Safetensors, tensor type F32)