SetFit with intfloat/multilingual-e5-small

This is a SetFit model for text classification. It uses intfloat/multilingual-e5-small as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
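
The contrastive step works on pairs of training sentences rather than on single sentences. The following is a minimal sketch of how such pairs could be built from labeled samples, assuming that same-label pairs get a target of 1.0 and different-label pairs a target of 0.0. The function name `make_contrastive_pairs` and the toy sentences (loosely translated from the label examples in this card) are illustrative and are not taken from the SetFit code base.

```python
from itertools import combinations


def make_contrastive_pairs(samples):
    """Build (text_a, text_b, target) triples from (text, label) samples.

    Pairs that share a label get target 1.0; pairs with different labels get 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data for illustration only
samples = [
    ("How do I draft an employment contract?", "independent"),
    ("Which taxes apply to companies?", "independent"),
    ("What are the consequences of this law?", "follow_up"),
]
pairs = make_contrastive_pairs(samples)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Because every pair of samples yields a training example, a few dozen labeled sentences already produce well over a thousand pairs, which is what makes the approach workable in a few-shot setting.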

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: intfloat/multilingual-e5-small
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
independent	'Comment rédiger un contrat de travail ?' 'Quels sont les impôts et taxes applicables aux entreprises ?' 'Comment peut-on contester un licenciement abusif ?'
follow_up	'Quelles sont les conséquences de cette loi ?' "Comment cette loi s'inscrit-elle dans le cadre plus large du droit algérien ?" "Comment puis-je obtenir plus d'informations sur ce sujet ?"

Evaluation

Metrics

Label	Accuracy
all	1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("super-cinnamon/fewshot-followup-multi-e5")
# Run inference
preds = model("Comment se déroule une procédure de divorce ?")
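
For several queries at once, the predictions can be grouped by label, for example to route follow-up questions differently from independent ones. The sketch below is one possible way to do this; `group_by_label` and `fake_predict` are hypothetical helpers, and `fake_predict` is only a keyword-based stand-in so that the example runs without downloading the model.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_label(
    predict: Callable[[List[str]], Iterable], queries: List[str]
) -> Dict[str, List[str]]:
    """Group queries by the label that `predict` assigns to each of them."""
    labels = [str(label) for label in predict(queries)]
    grouped = defaultdict(list)
    for query, label in zip(queries, labels):
        grouped[label].append(query)
    return dict(grouped)


def fake_predict(texts):
    # Stand-in for the real classifier (assumption): a crude keyword rule.
    return ["follow_up" if "this" in text.lower() else "independent" for text in texts]


queries = [
    "How does a divorce procedure work?",
    "What are the consequences of this law?",
]
grouped = group_by_label(fake_predict, queries)
print(grouped)
```

With the loaded model, `group_by_label(model.predict, queries)` should behave the same way, assuming `model.predict` returns one label string per input text.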

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	1	9.6184	16

Label	Training Sample Count
independent	43
follow_up	33
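
The sample counts above determine how many contrastive pairs are available. The arithmetic below is a sketch under two assumptions that the card does not confirm: the oversampling strategy draws equally many positive and negative pairs, repeating the scarcer kind, and positive pairs include a sentence paired with itself. The batch size of 8 is the first value of batch_size in the hyperparameters, assumed to apply to the contrastive phase.

```python
from math import ceil, comb

counts = {"independent": 43, "follow_up": 33}
batch_size = 8  # first value of batch_size: (8, 8)


def pair_counts(counts, include_self_pairs):
    """Return (positive, negative) counts of unordered sentence pairs."""
    extra = 1 if include_self_pairs else 0
    # comb(n + 1, 2) counts unordered pairs with repetition, comb(n, 2) without
    positive = sum(comb(n + extra, 2) for n in counts.values())
    labels = list(counts)
    negative = sum(
        counts[a] * counts[b]
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    )
    return positive, negative


positive, negative = pair_counts(counts, include_self_pairs=True)
pairs_per_epoch = 2 * max(positive, negative)  # oversample the scarcer kind
steps_per_epoch = ceil(pairs_per_epoch / batch_size)
print(positive, negative, pairs_per_epoch, steps_per_epoch)
```

Under these assumptions the result is 377 steps per epoch, which agrees with the training log (step 50 is reported at epoch 0.1326). Without self-pairs the same calculation gives 358 steps, so the self-pair assumption was chosen because it reproduces the log.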

Training Hyperparameters

batch_size: (8, 8)
num_epochs: (10, 10)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
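
The loss listed above, CosineSimilarityLoss, can be read as the mean squared error between the cosine similarity of two sentence embeddings and the pair's target (1.0 for same-label pairs, 0.0 otherwise). The following plain-Python sketch illustrates that computation on invented 2-D vectors; it is not the Sentence Transformers implementation. According to the SetFit documentation, distance_metric and margin apply to triplet-style losses and are not used with this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the pair target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings, invented for illustration
aligned = ([1.0, 0.0], [2.0, 0.0], 1.0)     # same direction, target 1.0
orthogonal = ([1.0, 0.0], [0.0, 3.0], 0.0)  # unrelated, target 0.0
mismatch = ([1.0, 0.0], [0.0, 1.0], 1.0)    # unrelated, but target 1.0
loss = cosine_similarity_loss([aligned, orthogonal, mismatch])
print(loss)
```

Only the third pair contributes to the loss here, since its embeddings are orthogonal although the target asks for them to be aligned.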

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0027	1	0.3915	-
0.1326	50	0.3193	-
0.2653	100	0.2252	-
0.3979	150	0.1141	-
0.5305	200	0.0197	-
0.6631	250	0.0019	-
0.7958	300	0.0021	-
0.9284	350	0.0002	-
1.0610	400	0.0008	-
1.1936	450	0.0005	-
1.3263	500	0.0002	-
1.4589	550	0.0002	-
1.5915	600	0.0007	-
1.7241	650	0.0001	-
1.8568	700	0.0003	-
1.9894	750	0.0002	-
2.1220	800	0.0001	-
2.2546	850	0.0002	-
2.3873	900	0.0	-
2.5199	950	0.0003	-
2.6525	1000	0.0001	-
2.7851	1050	0.0001	-
2.9178	1100	0.0001	-
3.0504	1150	0.0001	-
3.1830	1200	0.0001	-
3.3156	1250	0.0001	-
3.4483	1300	0.0001	-
3.5809	1350	0.0001	-
3.7135	1400	0.0	-
3.8462	1450	0.0	-
3.9788	1500	0.0	-
4.1114	1550	0.0	-
4.2440	1600	0.0001	-
4.3767	1650	0.0001	-
4.5093	1700	0.0001	-
4.6419	1750	0.0001	-
4.7745	1800	0.0	-
4.9072	1850	0.0001	-
5.0398	1900	0.0	-
5.1724	1950	0.0001	-
5.3050	2000	0.0	-
5.4377	2050	0.0001	-
5.5703	2100	0.0	-
5.7029	2150	0.0	-
5.8355	2200	0.0	-
5.9682	2250	0.0001	-
6.1008	2300	0.0001	-
6.2334	2350	0.0	-
6.3660	2400	0.0001	-
6.4987	2450	0.0	-
6.6313	2500	0.0	-
6.7639	2550	0.0	-
6.8966	2600	0.0	-
7.0292	2650	0.0	-
7.1618	2700	0.0	-
7.2944	2750	0.0	-
7.4271	2800	0.0001	-
7.5597	2850	0.0	-
7.6923	2900	0.0	-
7.8249	2950	0.0	-
7.9576	3000	0.0	-
8.0902	3050	0.0	-
8.2228	3100	0.0	-
8.3554	3150	0.0	-
8.4881	3200	0.0001	-
8.6207	3250	0.0	-
8.7533	3300	0.0	-
8.8859	3350	0.0	-
9.0186	3400	0.0001	-
9.1512	3450	0.0	-
9.2838	3500	0.0	-
9.4164	3550	0.0001	-
9.5491	3600	0.0	-
9.6817	3650	0.0001	-
9.8143	3700	0.0	-
9.9469	3750	0.0001	-
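
A quick way to read the table above is to find the first logged step at which the training loss falls below a chosen threshold. The sketch below does this for the first eight logged rows; the threshold of 0.001 and the helper name `first_step_below` are arbitrary choices for illustration.

```python
# (epoch, step, training loss) for the first eight logged rows
rows = [
    (0.0027, 1, 0.3915),
    (0.1326, 50, 0.3193),
    (0.2653, 100, 0.2252),
    (0.3979, 150, 0.1141),
    (0.5305, 200, 0.0197),
    (0.6631, 250, 0.0019),
    (0.7958, 300, 0.0021),
    (0.9284, 350, 0.0002),
]


def first_step_below(rows, threshold):
    """Return (epoch, step) of the first row whose loss is below threshold."""
    for epoch, step, loss in rows:
        if loss < threshold:
            return epoch, step
    return None


result = first_step_below(rows, threshold=0.001)
print(result)
```

The training loss on the contrastive pairs drops below this threshold before the end of the first epoch and stays near zero for the remaining nine.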

Framework Versions

Python: 3.10.12
SetFit: 1.0.1
Sentence Transformers: 2.2.2
Transformers: 4.35.2
PyTorch: 2.1.0+cu118
Datasets: 2.15.0
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
