SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
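The first step can be pictured as turning a handful of labeled sentences into sentence pairs: two sentences with the same label form a positive pair (target similarity 1.0), and two sentences with different labels form a negative pair (target 0.0). The sketch below is a minimal illustration of that idea, not SetFit's own implementation; `make_pairs` is a hypothetical helper, and the sentences are taken from the Model Labels table of this card.

```python
from itertools import combinations
from typing import List, Tuple


def make_pairs(samples: List[Tuple[str, str]]) -> List[Tuple[str, str, float]]:
    """Build (sentence_a, sentence_b, target) triples from (text, label) samples.

    Hypothetical helper for illustration only.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        # Same label -> embeddings should be similar; different label -> dissimilar.
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


samples = [
    ("Launch microphone app", "microphone"),
    ("Access mic app", "microphone"),
    ("Show history", "history"),
    ("View chat logs", "history"),
    ("Open the photo webcam", "camera"),
]

pairs = make_pairs(samples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(positives), "positive pairs,", len(negatives), "negative pairs")
```

A loss such as CosineSimilarityLoss then pushes the cosine similarity of each pair's embeddings toward its target value, which is what makes the embeddings useful features for the classification head in the second step.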

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/all-MiniLM-L6-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 256 tokens
Number of Classes: 3

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
microphone	'Launch microphone app' 'Launch recording app' 'Access mic app'
history	'View chat logs' 'Display conversation details' 'Show history'
camera	'Switch to webcam mode please' 'Could you switch to video camera mode?' 'Open the photo webcam'

Evaluation

Metrics

Label	Accuracy
all	1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("porxelek/word-classification")
# Run inference
preds = model("Show recent chats")
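For more than one input, it may be convenient to classify a whole list at once and keep each text next to its predicted label. The following is a sketch under assumptions: `classify_batch` and `load_predictor` are hypothetical helpers, and the exact type of the returned labels (string, NumPy value, or tensor) may depend on how the model was saved, so each label is converted with `str` here.

```python
from typing import Callable, Dict, Iterable, List


def classify_batch(predict: Callable[[List[str]], Iterable], texts: List[str]) -> Dict[str, str]:
    """Map each input text to its predicted label, converted to a plain string."""
    labels = predict(texts)
    return {text: str(label) for text, label in zip(texts, labels)}


def load_predictor() -> Callable[[List[str]], Iterable]:
    """Download the model from the Hub and return its batch prediction method."""
    # Imported lazily so the helper above can be used without setfit installed.
    from setfit import SetFitModel

    model = SetFitModel.from_pretrained("porxelek/word-classification")
    return model.predict
```

A call such as `classify_batch(load_predictor(), ["Show recent chats", "Open the photo webcam"])` would then return a dictionary keyed by input text.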

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	2	4.1364	10

Label	Training Sample Count
camera	250
history	150
microphone	150

Training Hyperparameters

batch_size: (64, 64)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0003	1	0.1209	-
0.0164	50	0.1449	-
0.0328	100	0.046	-
0.0492	150	0.0099	-
0.0656	200	0.0049	-
0.0820	250	0.0036	-
0.0985	300	0.0022	-
0.1149	350	0.0015	-
0.1313	400	0.0011	-
0.1477	450	0.001	-
0.1641	500	0.0009	-
0.1805	550	0.0009	-
0.1969	600	0.0009	-
0.2133	650	0.0008	-
0.2297	700	0.0007	-
0.2461	750	0.0006	-
0.2626	800	0.0006	-
0.2790	850	0.0006	-
0.2954	900	0.0006	-
0.3118	950	0.0005	-
0.3282	1000	0.0004	-
0.3446	1050	0.0005	-
0.3610	1100	0.0005	-
0.3774	1150	0.0004	-
0.3938	1200	0.0004	-
0.4102	1250	0.0004	-
0.4266	1300	0.0005	-
0.4431	1350	0.0004	-
0.4595	1400	0.0003	-
0.4759	1450	0.0003	-
0.4923	1500	0.0003	-
0.5087	1550	0.0003	-
0.5251	1600	0.0003	-
0.5415	1650	0.0003	-
0.5579	1700	0.0003	-
0.5743	1750	0.0003	-
0.5907	1800	0.0003	-
0.6072	1850	0.0002	-
0.6236	1900	0.0003	-
0.6400	1950	0.0002	-
0.6564	2000	0.0002	-
0.6728	2050	0.0002	-
0.6892	2100	0.0003	-
0.7056	2150	0.0002	-
0.7220	2200	0.0002	-
0.7384	2250	0.0002	-
0.7548	2300	0.0002	-
0.7713	2350	0.0002	-
0.7877	2400	0.0002	-
0.8041	2450	0.0002	-
0.8205	2500	0.0002	-
0.8369	2550	0.0002	-
0.8533	2600	0.0002	-
0.8697	2650	0.0002	-
0.8861	2700	0.0002	-
0.9025	2750	0.0002	-
0.9189	2800	0.0002	-
0.9353	2850	0.0002	-
0.9518	2900	0.0002	-
0.9682	2950	0.0002	-
0.9846	3000	0.0002	-
1.0	3047	-	0.0

The saved checkpoint is the final row (epoch 1.0, step 3047), the only row with a validation loss.
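The 3047 steps in the final row are consistent with the training sample counts, the batch size of 64, and the oversampling strategy reported in this card. The arithmetic below is a sketch under one assumption, namely that oversampling repeats the smaller of the positive and negative pair sets until it matches the larger one; it is a consistency check, not a description of SetFit's internals.

```python
import math
from itertools import combinations

# Training sample counts and batch size, as reported in this card.
counts = {"camera": 250, "history": 150, "microphone": 150}
batch_size = 64

# Positive pairs: two distinct samples that share a label.
positive_pairs = sum(math.comb(n, 2) for n in counts.values())

# Negative pairs: one sample from each of two different labels.
negative_pairs = sum(a * b for a, b in combinations(counts.values(), 2))

# Assumption: oversampling repeats the smaller set until it matches the larger one.
total_pairs = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = math.ceil(total_pairs / batch_size)

print(positive_pairs, negative_pairs, total_pairs, steps_per_epoch)
# -> 53475 97500 195000 3047

# The Epoch column is the step divided by the steps per epoch, e.g. step 50:
print(round(50 / steps_per_epoch, 4))
# -> 0.0164
```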

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 3.0.1
Transformers: 4.39.0
PyTorch: 2.3.1+cu121
Datasets: 2.20.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
