SetFit with sentence-transformers/all-MiniLM-L12-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/all-MiniLM-L12-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
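
The contrastive step trains on pairs of sentences rather than single sentences. The sketch below is a minimal illustration of how such pairs could be built from labeled examples: a pair gets target 1.0 when both texts share a label and 0.0 otherwise, which suits a cosine-similarity style loss. The function name `make_contrastive_pairs` is hypothetical, and enumerating every pair is a simplification; the SetFit library uses its own pair sampling, so treat this as a picture of the idea, not the library's implementation.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label, 0.0 otherwise.
    Hypothetical helper for illustration only.
    """
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    return pairs


# Three examples taken from the Model Labels table of this card.
texts = [
    "How do the conclusions differ?",
    "Contrast the main arguments presented in each paper",
    "Que dois-je retenir de ce doc ?",
]
labels = ["compare", "compare", "summary"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

After the embedding model is fine-tuned on such pairs, the second step embeds each training sentence once and fits the classification head (here a LogisticRegression instance) on those embeddings.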

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/all-MiniLM-L12-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 128 tokens
Number of Classes: 5

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
sub_queries	'Could you break down the main factors I should consider when researching market prices and how to effectively communicate our needs to the supplier during negotiations?' 'Comment faire pousser une plante et le mesurer ?' "Quel est le meilleur matériau pour l'isolation phonique et thermique?"
simple_questions	'What are the key strategies for maintaining efficient communication in a remote work environment?' 'Could you summarize the ways a person can help in adapting to climate change ?' 'What are the current trends in construction?'
exchange	'Could you please restate your last explanation using simpler terms?' 'Could you restate the impact of augmented reality on design practices?' 'Pourriez-vous me donner un résumé des principaux points abordés dans notre conversation précédente ?'
compare	'How do the conclusions differ?' 'Contrast the main arguments presented in each paper' 'Quelles sont les principales différences dans les programmes éducatifs décrits dans ces documents ?'
summary	'Que dois-je retenir de ce doc ?' 'What are the key assertions made within the text' 'What are the most important argument stated in the document?'

Evaluation

Metrics

Label	Accuracy
all	0.9333
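
Accuracy here is the fraction of evaluation examples whose predicted label matches the reference label. The sketch below shows that computation on made-up data; the predictions and reference labels are hypothetical and are not the evaluation set behind the 0.9333 figure, whose size is not stated in this card.

```python
def accuracy(predictions, references):
    """Fraction of positions where prediction equals reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical predictions and reference labels, for illustration only.
preds = ["compare", "summary", "exchange", "summary"]
gold = ["compare", "summary", "exchange", "compare"]

print(round(accuracy(preds, gold), 4))  # 0.75
```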

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("egis-group/router_mini_lm_l6")
# Run inference
preds = model("Compare ces deux documents")
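
Since the five labels describe kinds of user request, one plausible use of the prediction is to dispatch each query to a matching handler. The sketch below is an assumption about usage, not something this card specifies: the handler functions and the `route` helper are hypothetical, and it assumes the classifier returns one of the label strings from the Model Labels table. A stand-in classifier is used so the example runs without downloading the model.

```python
from typing import Callable, Dict

# Label names come from the Model Labels table; the handlers are hypothetical.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "sub_queries": lambda q: f"split into sub-queries: {q}",
    "simple_questions": lambda q: f"answer directly: {q}",
    "exchange": lambda q: f"use conversation history: {q}",
    "compare": lambda q: f"compare documents: {q}",
    "summary": lambda q: f"summarize document: {q}",
}


def route(query: str, classify: Callable[[str], object]) -> str:
    """Classify the query and call the handler registered for its label."""
    label = str(classify(query))
    handler = HANDLERS.get(label)
    if handler is None:
        raise ValueError(f"Unexpected label: {label!r}")
    return handler(query)


# Stand-in classifier so the sketch runs offline; it always answers "compare".
def fake_classify(query: str) -> str:
    return "compare"


print(route("Compare ces deux documents", fake_classify))
```

With the real model loaded as shown above, `route(query, model)` would be the corresponding call, provided the model returns label strings. If it returns something else, such as class indices, `route` raises `ValueError` and the mapping needs to be adapted.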

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	4	13.4389	48

Label	Training Sample Count
negative	0
positive	0

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (4, 4)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
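
A few quantities follow from these settings combined with the step counts in the Training Results table. The sketch below works them out, under two assumptions: that the first entry of each tuple applies to the embedding fine-tuning phase, and that warmup covers `warmup_proportion` of all steps, rounded up. The exact rounding used by the library is not stated in this card.

```python
import math

# Values copied from the hyperparameter list and the Training Results table.
batch_size = 16           # assumed: first entry of (16, 16), embedding phase
num_epochs = 4            # assumed: first entry of (4, 4), embedding phase
warmup_proportion = 0.1
total_steps = 13204       # final step in the Training Results table

# Steps per epoch, consistent with the epoch-1.0 row at step 3301.
steps_per_epoch = total_steps // num_epochs

# Upper bound on sentence pairs seen per epoch (the last batch may be partial).
pairs_per_epoch_upper = steps_per_epoch * batch_size

# Assumed warmup length: proportion of all steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_proportion)

print(steps_per_epoch)        # 3301
print(pairs_per_epoch_upper)  # 52816
print(warmup_steps)           # 1321
```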

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.4073	-
0.0151	50	0.3054	-
0.0303	100	0.2066	-
0.0454	150	0.2664	-
0.0606	200	0.2463	-
0.0757	250	0.214	-
0.0909	300	0.1892	-
0.1060	350	0.1402	-
0.1212	400	0.1804	-
0.1363	450	0.0571	-
0.1515	500	0.0979	-
0.1666	550	0.1775	-
0.1818	600	0.0377	-
0.1969	650	0.0398	-
0.2121	700	0.0423	-
0.2272	750	0.0036	-
0.2424	800	0.0079	-
0.2575	850	0.0049	-
0.2726	900	0.0018	-
0.2878	950	0.0018	-
0.3029	1000	0.0032	-
0.3181	1050	0.0019	-
0.3332	1100	0.0008	-
0.3484	1150	0.0006	-
0.3635	1200	0.0006	-
0.3787	1250	0.0011	-
0.3938	1300	0.0005	-
0.4090	1350	0.001	-
0.4241	1400	0.0009	-
0.4393	1450	0.0004	-
0.4544	1500	0.0003	-
0.4696	1550	0.0003	-
0.4847	1600	0.0006	-
0.4998	1650	0.0003	-
0.5150	1700	0.0002	-
0.5301	1750	0.0002	-
0.5453	1800	0.0005	-
0.5604	1850	0.0003	-
0.5756	1900	0.0002	-
0.5907	1950	0.0002	-
0.6059	2000	0.0001	-
0.6210	2050	0.0002	-
0.6362	2100	0.0002	-
0.6513	2150	0.0001	-
0.6665	2200	0.0002	-
0.6816	2250	0.0002	-
0.6968	2300	0.0002	-
0.7119	2350	0.0002	-
0.7271	2400	0.0002	-
0.7422	2450	0.0002	-
0.7573	2500	0.0001	-
0.7725	2550	0.0001	-
0.7876	2600	0.0002	-
0.8028	2650	0.0001	-
0.8179	2700	0.0002	-
0.8331	2750	0.0007	-
0.8482	2800	0.0001	-
0.8634	2850	0.0001	-
0.8785	2900	0.0001	-
0.8937	2950	0.0001	-
0.9088	3000	0.0001	-
0.9240	3050	0.0002	-
0.9391	3100	0.0001	-
0.9543	3150	0.0001	-
0.9694	3200	0.0001	-
0.9846	3250	0.0001	-
0.9997	3300	0.0002	-
1.0	3301	-	0.0001
1.0148	3350	0.0003	-
1.0300	3400	0.0002	-
1.0451	3450	0.0001	-
1.0603	3500	0.0001	-
1.0754	3550	0.0001	-
1.0906	3600	0.0001	-
1.1057	3650	0.0001	-
1.1209	3700	0.0002	-
1.1360	3750	0.0001	-
1.1512	3800	0.0001	-
1.1663	3850	0.0001	-
1.1815	3900	0.0001	-
1.1966	3950	0.001	-
1.2118	4000	0.0001	-
1.2269	4050	0.0001	-
1.2420	4100	0.0001	-
1.2572	4150	0.0001	-
1.2723	4200	0.0001	-
1.2875	4250	0.0001	-
1.3026	4300	0.0001	-
1.3178	4350	0.0	-
1.3329	4400	0.0001	-
1.3481	4450	0.0001	-
1.3632	4500	0.0001	-
1.3784	4550	0.0001	-
1.3935	4600	0.0001	-
1.4087	4650	0.0001	-
1.4238	4700	0.0001	-
1.4390	4750	0.0001	-
1.4541	4800	0.0	-
1.4693	4850	0.0	-
1.4844	4900	0.0001	-
1.4995	4950	0.0001	-
1.5147	5000	0.0001	-
1.5298	5050	0.0001	-
1.5450	5100	0.0	-
1.5601	5150	0.0001	-
1.5753	5200	0.0	-
1.5904	5250	0.0	-
1.6056	5300	0.0001	-
1.6207	5350	0.0	-
1.6359	5400	0.0001	-
1.6510	5450	0.0	-
1.6662	5500	0.0001	-
1.6813	5550	0.0001	-
1.6965	5600	0.0	-
1.7116	5650	0.0	-
1.7267	5700	0.0	-
1.7419	5750	0.0001	-
1.7570	5800	0.0001	-
1.7722	5850	0.0	-
1.7873	5900	0.0	-
1.8025	5950	0.0001	-
1.8176	6000	0.0002	-
1.8328	6050	0.0	-
1.8479	6100	0.0001	-
1.8631	6150	0.0001	-
1.8782	6200	0.0001	-
1.8934	6250	0.0	-
1.9085	6300	0.0001	-
1.9237	6350	0.0	-
1.9388	6400	0.0001	-
1.9540	6450	0.0001	-
1.9691	6500	0.0	-
1.9842	6550	0.0	-
1.9994	6600	0.0	-
2.0	6602	-	0.0
2.0145	6650	0.0	-
2.0297	6700	0.0	-
2.0448	6750	0.0	-
2.0600	6800	0.0	-
2.0751	6850	0.0	-
2.0903	6900	0.0001	-
2.1054	6950	0.0	-
2.1206	7000	0.0	-
2.1357	7050	0.0	-
2.1509	7100	0.0001	-
2.1660	7150	0.0	-
2.1812	7200	0.0	-
2.1963	7250	0.0	-
2.2115	7300	0.0	-
2.2266	7350	0.0001	-
2.2417	7400	0.0	-
2.2569	7450	0.0	-
2.2720	7500	0.0001	-
2.2872	7550	0.0001	-
2.3023	7600	0.0	-
2.3175	7650	0.0	-
2.3326	7700	0.0	-
2.3478	7750	0.0	-
2.3629	7800	0.0	-
2.3781	7850	0.0	-
2.3932	7900	0.0	-
2.4084	7950	0.0	-
2.4235	8000	0.0	-
2.4387	8050	0.0	-
2.4538	8100	0.0001	-
2.4689	8150	0.0	-
2.4841	8200	0.0001	-
2.4992	8250	0.0	-
2.5144	8300	0.0	-
2.5295	8350	0.0001	-
2.5447	8400	0.0	-
2.5598	8450	0.0	-
2.5750	8500	0.0	-
2.5901	8550	0.0001	-
2.6053	8600	0.0001	-
2.6204	8650	0.0	-
2.6356	8700	0.0	-
2.6507	8750	0.0	-
2.6659	8800	0.0	-
2.6810	8850	0.0	-
2.6962	8900	0.0	-
2.7113	8950	0.0	-
2.7264	9000	0.0	-
2.7416	9050	0.0001	-
2.7567	9100	0.0001	-
2.7719	9150	0.0	-
2.7870	9200	0.0001	-
2.8022	9250	0.0	-
2.8173	9300	0.0	-
2.8325	9350	0.0	-
2.8476	9400	0.0	-
2.8628	9450	0.0	-
2.8779	9500	0.0	-
2.8931	9550	0.0	-
2.9082	9600	0.0	-
2.9234	9650	0.0	-
2.9385	9700	0.0	-
2.9537	9750	0.0	-
2.9688	9800	0.0	-
2.9839	9850	0.0	-
2.9991	9900	0.0	-
3.0	9903	-	0.0
3.0142	9950	0.0	-
3.0294	10000	0.0	-
3.0445	10050	0.0	-
3.0597	10100	0.0	-
3.0748	10150	0.0	-
3.0900	10200	0.0	-
3.1051	10250	0.0001	-
3.1203	10300	0.0001	-
3.1354	10350	0.0	-
3.1506	10400	0.0	-
3.1657	10450	0.0	-
3.1809	10500	0.0	-
3.1960	10550	0.0	-
3.2111	10600	0.0	-
3.2263	10650	0.0	-
3.2414	10700	0.0	-
3.2566	10750	0.0	-
3.2717	10800	0.0	-
3.2869	10850	0.0	-
3.3020	10900	0.0	-
3.3172	10950	0.0	-
3.3323	11000	0.0	-
3.3475	11050	0.0	-
3.3626	11100	0.0	-
3.3778	11150	0.0	-
3.3929	11200	0.0	-
3.4081	11250	0.0001	-
3.4232	11300	0.0	-
3.4384	11350	0.0	-
3.4535	11400	0.0	-
3.4686	11450	0.0	-
3.4838	11500	0.0	-
3.4989	11550	0.0	-
3.5141	11600	0.0	-
3.5292	11650	0.0	-
3.5444	11700	0.0	-
3.5595	11750	0.0	-
3.5747	11800	0.0	-
3.5898	11850	0.0	-
3.6050	11900	0.0	-
3.6201	11950	0.0	-
3.6353	12000	0.0	-
3.6504	12050	0.0	-
3.6656	12100	0.0001	-
3.6807	12150	0.0	-
3.6958	12200	0.0	-
3.7110	12250	0.0	-
3.7261	12300	0.0	-
3.7413	12350	0.0	-
3.7564	12400	0.0	-
3.7716	12450	0.0	-
3.7867	12500	0.0	-
3.8019	12550	0.0	-
3.8170	12600	0.0	-
3.8322	12650	0.0	-
3.8473	12700	0.0	-
3.8625	12750	0.0	-
3.8776	12800	0.0	-
3.8928	12850	0.0	-
3.9079	12900	0.0	-
3.9231	12950	0.0	-
3.9382	13000	0.0	-
3.9533	13050	0.0	-
3.9685	13100	0.0	-
3.9836	13150	0.0	-
3.9988	13200	0.0	-
4.0	13204	-	0.0

The bold row denotes the saved checkpoint.

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 3.0.1
Transformers: 4.39.0
PyTorch: 2.3.0+cu121
Datasets: 2.19.2
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
