---
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
base_model: sentence-transformers/all-MiniLM-L12-v2
metrics:
  - accuracy
widget:
  - text: Could you provide the average temperature, annual rainfall in Paris?
  - text: >-
      Can you provide a summary of the key points discussed about urban
      development?
  - text: Compare ces deux documents
  - text: What are the steps required to apply for a passport?
  - text: What is the basic definition of seismic design?
pipeline_tag: text-classification
inference: true
model-index:
  - name: SetFit with sentence-transformers/all-MiniLM-L12-v2
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.7333333333333333
            name: Accuracy
---

SetFit with sentence-transformers/all-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
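
As a rough illustration of this two-step procedure, here is a minimal sketch using the setfit training API. The tiny dataset and hyperparameters below are purely illustrative (the texts and label names are taken from the examples on this card), not this model's actual training data or script:

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Illustrative few-shot data only -- not the data used to train this model.
train_dataset = Dataset.from_dict({
    "text": [
        "Compare ces deux documents",
        "Compare the methodologies",
        "What are the steps required to apply for a passport?",
        "What is the purpose of environmental impact assessments?",
    ],
    "label": ["compare", "compare", "simple_questions", "simple_questions"],
})

# Start from the base Sentence Transformer; a LogisticRegression head is attached by default.
model = SetFitModel.from_pretrained("sentence-transformers/all-MiniLM-L12-v2")

trainer = Trainer(
    model=model,
    args=TrainingArguments(batch_size=16, num_epochs=1),
    train_dataset=train_dataset,
)

# Step 1: contrastive fine-tuning of the embedding model.
# Step 2: fitting the classification head on the fine-tuned embeddings.
trainer.train()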

Model Details

Model Description

Model Sources

Model Labels

Examples for each label:
sub_queries
  • 'How can I use 3D print to build a bridge and how much would it be?'
  • 'Pourriez-vous détailler les critères spécifiques utilisés pour évaluer la durabilité des matériaux de construction, les types de systèmes HVAC les plus efficaces actuellement en usage dans les bâtiments verts, et les différentes méthodes employées pour réduire les déchets pendant la phase de construction ?'
  • 'Comment faire une etude de marche? Quelles sont les meilleures sources?'
summary
  • 'Quelles informations primordiales me conseillez-vous de mémoriser de ce document'
  • 'Quels sont les points principaux à retenir'
  • 'What is the primary theme of the document ?'
exchange
  • 'Pourriez-vous me fournir un résumé des points clés abordés dans notre discussion précédente ?'
  • 'Quels sont les points clés abordés dans notre discussion précédente ?'
  • 'Could you restate the main points discussed about acoustic engineering?'
simple_questions
  • 'Quelle est le principal moteur de la croissance économique ? Fais un post linkedin sur le sujet'
  • 'Pourriez-vous résumer les bénéfices que les utilisateurs peuvent tirer des récentes avancées en matériel informatique ?'
  • 'What is the purpose of environmental impact assessments?'
compare
  • 'Compare the methodologies'
  • 'Compare the nutritional information provided on these food labels'
  • 'Analysez comment la structure narrative de ces manuscrits influence leur message'

Evaluation

Metrics

Label | Accuracy
all   | 0.7333
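
The reported accuracy can be reproduced in the usual way by comparing the model's predictions against gold labels on a held-out split. A minimal sketch, assuming setfit and scikit-learn are installed and using illustrative test examples (not the actual evaluation set):

from setfit import SetFitModel
from sklearn.metrics import accuracy_score

# Illustrative held-out examples and gold labels -- not the real test split.
test_texts = ["Compare ces deux documents", "Quels sont les points principaux à retenir"]
test_labels = ["compare", "summary"]

model = SetFitModel.from_pretrained("egis-group/router_mini_lm_l6")
preds = model.predict(test_texts)  # one predicted label per input text
print(accuracy_score(test_labels, list(preds)))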

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("egis-group/router_mini_lm_l6")
# Run inference
preds = model("Compare ces deux documents")

Training Details

Training Set Metrics

Training set | Min | Median  | Max
Word count   | 3   | 13.4636 | 48

Label    | Training Sample Count
negative | 0
positive | 0

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
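
For reference, most of these values map directly onto fields of setfit.TrainingArguments. A sketch of the corresponding configuration, assuming the setfit 1.0.x argument names (this is not the exact training script used for this model):

from sentence_transformers.losses import BatchHardTripletLossDistanceFunction, CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(16, 16),                  # (embedding fine-tuning, classifier head)
    num_epochs=(4, 4),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=True,
)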

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.3239 -
0.0152 50 0.3443 -
0.0304 100 0.2282 -
0.0456 150 0.2576 -
0.0608 200 0.2587 -
0.0760 250 0.1747 -
0.0912 300 0.1916 -
0.1064 350 0.1638 -
0.1216 400 0.1459 -
0.1368 450 0.1322 -
0.1520 500 0.038 -
0.1672 550 0.0636 -
0.1824 600 0.0613 -
0.1976 650 0.0322 -
0.2128 700 0.0159 -
0.2280 750 0.0029 -
0.2432 800 0.0012 -
0.2584 850 0.0019 -
0.2736 900 0.0025 -
0.2888 950 0.0028 -
0.3040 1000 0.001 -
0.3192 1050 0.0014 -
0.3344 1100 0.0007 -
0.3497 1150 0.001 -
0.3649 1200 0.0014 -
0.3801 1250 0.0003 -
0.3953 1300 0.0005 -
0.4105 1350 0.0003 -
0.4257 1400 0.0004 -
0.4409 1450 0.0003 -
0.4561 1500 0.0004 -
0.4713 1550 0.0003 -
0.4865 1600 0.0002 -
0.5017 1650 0.0004 -
0.5169 1700 0.0003 -
0.5321 1750 0.0003 -
0.5473 1800 0.0004 -
0.5625 1850 0.0002 -
0.5777 1900 0.0001 -
0.5929 1950 0.0001 -
0.6081 2000 0.0003 -
0.6233 2050 0.0002 -
0.6385 2100 0.0001 -
0.6537 2150 0.0002 -
0.6689 2200 0.0002 -
0.6841 2250 0.0001 -
0.6993 2300 0.0002 -
0.7145 2350 0.0003 -
0.7297 2400 0.0002 -
0.7449 2450 0.0002 -
0.7601 2500 0.0001 -
0.7753 2550 0.0002 -
0.7905 2600 0.0001 -
0.8057 2650 0.0001 -
0.8209 2700 0.0001 -
0.8361 2750 0.0001 -
0.8513 2800 0.0001 -
0.8665 2850 0.0001 -
0.8817 2900 0.0001 -
0.8969 2950 0.0001 -
0.9121 3000 0.0001 -
0.9273 3050 0.0001 -
0.9425 3100 0.0001 -
0.9577 3150 0.0001 -
0.9729 3200 0.0001 -
0.9881 3250 0.0001 -
1.0 3289 - 0.0982
1.0033 3300 0.0001 -
1.0185 3350 0.0001 -
1.0337 3400 0.0001 -
1.0490 3450 0.0001 -
1.0642 3500 0.0001 -
1.0794 3550 0.0249 -
1.0946 3600 0.0002 -
1.1098 3650 0.0001 -
1.1250 3700 0.0001 -
1.1402 3750 0.0001 -
1.1554 3800 0.0001 -
1.1706 3850 0.0001 -
1.1858 3900 0.0001 -
1.2010 3950 0.0001 -
1.2162 4000 0.0001 -
1.2314 4050 0.0 -
1.2466 4100 0.0001 -
1.2618 4150 0.0 -
1.2770 4200 0.0001 -
1.2922 4250 0.0 -
1.3074 4300 0.0001 -
1.3226 4350 0.0001 -
1.3378 4400 0.0001 -
1.3530 4450 0.0001 -
1.3682 4500 0.0001 -
1.3834 4550 0.0001 -
1.3986 4600 0.0001 -
1.4138 4650 0.0001 -
1.4290 4700 0.0001 -
1.4442 4750 0.0001 -
1.4594 4800 0.0001 -
1.4746 4850 0.0001 -
1.4898 4900 0.0 -
1.5050 4950 0.0 -
1.5202 5000 0.0 -
1.5354 5050 0.0 -
1.5506 5100 0.0 -
1.5658 5150 0.0001 -
1.5810 5200 0.0001 -
1.5962 5250 0.0 -
1.6114 5300 0.0 -
1.6266 5350 0.0001 -
1.6418 5400 0.0001 -
1.6570 5450 0.0 -
1.6722 5500 0.0001 -
1.6874 5550 0.0 -
1.7026 5600 0.0001 -
1.7178 5650 0.0 -
1.7330 5700 0.0001 -
1.7483 5750 0.0001 -
1.7635 5800 0.0001 -
1.7787 5850 0.0001 -
1.7939 5900 0.0 -
1.8091 5950 0.0001 -
1.8243 6000 0.0001 -
1.8395 6050 0.0 -
1.8547 6100 0.0001 -
1.8699 6150 0.0 -
1.8851 6200 0.0 -
1.9003 6250 0.0 -
1.9155 6300 0.0 -
1.9307 6350 0.0001 -
1.9459 6400 0.0 -
1.9611 6450 0.0 -
1.9763 6500 0.0001 -
1.9915 6550 0.0 -
2.0 6578 - 0.0939
2.0067 6600 0.0001 -
2.0219 6650 0.0001 -
2.0371 6700 0.0001 -
2.0523 6750 0.0001 -
2.0675 6800 0.0 -
2.0827 6850 0.0 -
2.0979 6900 0.0 -
2.1131 6950 0.0 -
2.1283 7000 0.0001 -
2.1435 7050 0.0001 -
2.1587 7100 0.0 -
2.1739 7150 0.0 -
2.1891 7200 0.0001 -
2.2043 7250 0.0001 -
2.2195 7300 0.0 -
2.2347 7350 0.0 -
2.2499 7400 0.0 -
2.2651 7450 0.0 -
2.2803 7500 0.0 -
2.2955 7550 0.0001 -
2.3107 7600 0.0 -
2.3259 7650 0.0001 -
2.3411 7700 0.0 -
2.3563 7750 0.0001 -
2.3715 7800 0.0 -
2.3867 7850 0.0001 -
2.4019 7900 0.0 -
2.4171 7950 0.0 -
2.4324 8000 0.0 -
2.4476 8050 0.0001 -
2.4628 8100 0.0001 -
2.4780 8150 0.0 -
2.4932 8200 0.0001 -
2.5084 8250 0.0001 -
2.5236 8300 0.0001 -
2.5388 8350 0.0 -
2.5540 8400 0.0 -
2.5692 8450 0.0 -
2.5844 8500 0.0 -
2.5996 8550 0.0 -
2.6148 8600 0.0 -
2.6300 8650 0.0 -
2.6452 8700 0.0 -
2.6604 8750 0.0 -
2.6756 8800 0.0 -
2.6908 8850 0.0 -
2.7060 8900 0.0001 -
2.7212 8950 0.0 -
2.7364 9000 0.0 -
2.7516 9050 0.0001 -
2.7668 9100 0.0 -
2.7820 9150 0.0 -
2.7972 9200 0.0 -
2.8124 9250 0.0 -
2.8276 9300 0.0 -
2.8428 9350 0.0 -
2.8580 9400 0.0 -
2.8732 9450 0.0 -
2.8884 9500 0.0 -
2.9036 9550 0.0 -
2.9188 9600 0.0 -
2.9340 9650 0.0 -
2.9492 9700 0.0 -
2.9644 9750 0.0 -
2.9796 9800 0.0 -
2.9948 9850 0.0 -
3.0 9867 - 0.0951
3.0100 9900 0.0 -
3.0252 9950 0.0 -
3.0404 10000 0.0 -
3.0556 10050 0.0 -
3.0708 10100 0.0 -
3.0860 10150 0.0 -
3.1012 10200 0.0 -
3.1164 10250 0.0 -
3.1317 10300 0.0 -
3.1469 10350 0.0 -
3.1621 10400 0.0 -
3.1773 10450 0.0001 -
3.1925 10500 0.0 -
3.2077 10550 0.0 -
3.2229 10600 0.0 -
3.2381 10650 0.0 -
3.2533 10700 0.0 -
3.2685 10750 0.0 -
3.2837 10800 0.0 -
3.2989 10850 0.0 -
3.3141 10900 0.0 -
3.3293 10950 0.0 -
3.3445 11000 0.0 -
3.3597 11050 0.0 -
3.3749 11100 0.0 -
3.3901 11150 0.0 -
3.4053 11200 0.0 -
3.4205 11250 0.0 -
3.4357 11300 0.0 -
3.4509 11350 0.0 -
3.4661 11400 0.0 -
3.4813 11450 0.0 -
3.4965 11500 0.0 -
3.5117 11550 0.0 -
3.5269 11600 0.0 -
3.5421 11650 0.0 -
3.5573 11700 0.0 -
3.5725 11750 0.0 -
3.5877 11800 0.0 -
3.6029 11850 0.0 -
3.6181 11900 0.0 -
3.6333 11950 0.0 -
3.6485 12000 0.0 -
3.6637 12050 0.0 -
3.6789 12100 0.0 -
3.6941 12150 0.0 -
3.7093 12200 0.0 -
3.7245 12250 0.0 -
3.7397 12300 0.0 -
3.7549 12350 0.0 -
3.7701 12400 0.0 -
3.7853 12450 0.0 -
3.8005 12500 0.0 -
3.8157 12550 0.0 -
3.8310 12600 0.0 -
3.8462 12650 0.0 -
3.8614 12700 0.0 -
3.8766 12750 0.0 -
3.8918 12800 0.0 -
3.9070 12850 0.0 -
3.9222 12900 0.0 -
3.9374 12950 0.0 -
3.9526 13000 0.0 -
3.9678 13050 0.0 -
3.9830 13100 0.0 -
3.9982 13150 0.0 -
4.0 13156 - 0.0954
  • In the original table the saved checkpoint is shown in bold; with load_best_model_at_end enabled, this is typically the epoch with the lowest validation loss (here epoch 2.0, step 6578).

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.3.0+cu121
  • Datasets: 2.19.2
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}