
SetFit with intfloat/e5-small-v2

This is a SetFit model for text classification. It uses intfloat/e5-small-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained using an efficient few-shot learning technique that involves two steps (a minimal training sketch follows the list):

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
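
A minimal training sketch using setfit's Trainer API, following the card's own conventions. The inline dataset, its texts, and the label readings are illustrative assumptions, not the actual training data:

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Illustrative few-shot data; the real training set had 85 label-0 and 87 label-1 examples
train_dataset = Dataset.from_dict({
    "text": [
        "query: How was your day?",       # conversation continues -> label 0 (assumed)
        "query: See you tomorrow, bye!",  # conversation ends -> label 1 (assumed)
    ],
    "label": [0, 1],
})

# Start from the same embedding body; SetFit attaches a LogisticRegression head by default
model = SetFitModel.from_pretrained("intfloat/e5-small-v2")

trainer = Trainer(
    model=model,
    args=TrainingArguments(batch_size=(4, 1), num_epochs=(1, 1)),
    train_dataset=train_dataset,
)
trainer.train()  # step 1: contrastive fine-tuning of the body; step 2: fit the head
model.save_pretrained("stay_or_go_classifier")  # hypothetical output path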

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: intfloat/e5-small-v2
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2
  • Model size: 33.4M parameters (F32, safetensors)

Model Sources

  • Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label  Examples
0
  • 'query: Oi Pedro, você viu o novo filme que estreou semana passada?' (Portuguese: "Hey Pedro, did you see the new movie that premiered last week?")
  • 'query: Também gostei muito. Quem sabe podemos assistir juntos na próxima vez.' (Portuguese: "I liked it a lot too. Maybe we can watch it together next time.")
  • 'query: Jeg har det godt, tak. Hvad med dig?' (Danish: "I'm doing well, thanks. What about you?")
1
  • 'query: Combinado! Vamos marcar um dia. Até mais!' (Portuguese: "Deal! Let's set a date. See you later!")
  • 'query: Måske. Skal vi tale om det senere?' (Danish: "Maybe. Should we talk about it later?")
  • 'query: Absolument. On se voit ce soir pour fêter ça. À plus tard!' (French: "Absolutely. See you tonight to celebrate. See you later!")

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("thegenerativegeneration/stay_or_go_conversation_classifier_xs")
# Run inference; note the "query: " prefix used by e5-style models
preds = model("query: 好的,那就先这样,李先生,再见。")  # "OK, let's leave it at that then, Mr. Li. Goodbye."
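
The model also accepts a list of strings and returns one predicted label per input. A short batch example; the texts and the reading of the labels (0 for turns that continue the conversation, 1 for closing turns) are assumptions based on the Model Labels section above:

# Batch inference; keep the "query: " prefix used throughout the training data
preds = model([
    "query: How was your day?",           # likely 0: conversation continues
    "query: See you tomorrow, goodbye!",  # likely 1: conversation ends
])
print(preds)  # one predicted label per input, e.g. tensor([0, 1])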

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     2     6.2674   18

Label   Training Sample Count
0       85
1       87

Training Hyperparameters

  • batch_size: (4, 1)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: undersampling
  • body_learning_rate: (1e-06, 1e-06)
  • head_learning_rate: 8e-06
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.05
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • run_name: intfloat/e5-small-v2
  • eval_max_steps: -1
  • load_best_model_at_end: True
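
Tuple-valued settings pair the two training phases as (embedding fine-tuning, classifier head). As a hedged sketch, these settings map onto setfit's TrainingArguments roughly as follows (assuming the setfit 1.0.x API listed under Framework Versions; CosineSimilarityLoss comes from sentence-transformers):

from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

# Each tuple is (embedding fine-tuning phase, classifier head phase)
args = TrainingArguments(
    batch_size=(4, 1),
    num_epochs=(1, 1),
    body_learning_rate=(1e-06, 1e-06),
    head_learning_rate=8e-06,
    sampling_strategy="undersampling",
    loss=CosineSimilarityLoss,
    margin=0.05,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    load_best_model_at_end=True,
)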

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.3851 -
0.0135 50 0.3455 -
0.0270 100 0.3359 0.3522
0.0406 150 0.3459 -
0.0541 200 0.3645 0.3221
0.0676 250 0.3264 -
0.0811 300 0.2955 0.2759
0.0946 350 0.2546 -
0.1082 400 0.2253 0.2373
0.1217 450 0.2004 -
0.1352 500 0.3578 0.2318
0.1487 550 0.2628 -
0.1622 600 0.2614 0.2222
0.1758 650 0.2095 -
0.1893 700 0.2345 0.2196
0.2028 750 0.1842 -
0.2163 800 0.1942 0.2326
0.2299 850 0.2180 -
0.2434 900 0.3134 0.2422
0.2569 950 0.1639 -
0.2704 1000 0.2138 0.2300
0.2839 1050 0.3102 -
0.2975 1100 0.1347 0.2348
0.3110 1150 0.1698 -
0.3245 1200 0.2467 0.2547
0.3380 1250 0.1064 -
0.3515 1300 0.1757 0.2383
0.3651 1350 0.1093 -
0.3786 1400 0.2869 0.2393
0.3921 1450 0.2519 -
0.4056 1500 0.2344 0.2323
0.4191 1550 0.2804 -
0.4327 1600 0.1082 0.2403
0.4462 1650 0.2025 -
0.4597 1700 0.2213 0.2547
0.4732 1750 0.1302 -
0.4867 1800 0.1517 0.2345
0.5003 1850 0.2779 -
0.5138 1900 0.1918 0.2339
0.5273 1950 0.1132 -
0.5408 2000 0.2075 0.2530
0.5544 2050 0.2488 -
0.5679 2100 0.0579 0.2526
0.5814 2150 0.3789 -
0.5949 2200 0.1670 0.2573
0.6084 2250 0.1990 -
0.6220 2300 0.0824 0.2258
0.6355 2350 0.1396 -
0.6490 2400 0.3674 0.2527
0.6625 2450 0.2448 -
0.6760 2500 0.1623 0.2490
0.6896 2550 0.2198 -
0.7031 2600 0.1180 0.2613
0.7166 2650 0.1511 -
0.7301 2700 0.1162 0.2351
0.7436 2750 0.1393 -
0.7572 2800 0.1845 0.2418
0.7707 2850 0.1821 -
0.7842 2900 0.1762 0.254
0.7977 2950 0.0477 -
0.8112 3000 0.1928 0.2633
0.8248 3050 0.1363 -
0.8383 3100 0.0811 0.2610
0.8518 3150 0.0734 -
0.8653 3200 0.0917 0.2202
0.8789 3250 0.3027 -
0.8924 3300 0.1528 0.2767
0.9059 3350 0.2234 -
0.9194 3400 0.1048 0.2667
0.9329 3450 0.1865 -
0.9465 3500 0.0510 0.2612
0.9600 3550 0.0218 -
0.9735 3600 0.1524 0.2430
0.9870 3650 0.1759 -
  • With load_best_model_at_end: True, the saved checkpoint corresponds to the row with the lowest validation loss (step 700, validation loss 0.2196).

Framework Versions

  • Python: 3.10.11
  • SetFit: 1.0.3
  • Sentence Transformers: 2.7.0
  • Transformers: 4.39.0
  • PyTorch: 2.3.1
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2
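
To reproduce this environment, the versions above can be pinned at install time (a sketch; the PyPI package names are the standard ones for these libraries):

pip install setfit==1.0.3 sentence-transformers==2.7.0 transformers==4.39.0 torch==2.3.1 datasets==2.20.0 tokenizers==0.15.2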

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}