SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
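
As a rough illustration of the first step, the sketch below builds the kind of labeled sentence pairs that contrastive fine-tuning trains on: two texts with the same class label form a positive pair (target similarity 1.0), and two texts with different labels form a negative pair (target 0.0). The function name `make_contrastive_pairs` and the toy sentences are hypothetical, and SetFit's own sampler differs in detail (for example, in how it balances and shuffles pairs).

```python
from itertools import combinations

def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for every unordered pair of samples.

    target is 1.0 when both samples share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Toy data (hypothetical): 1 = positive review, 0 = negative review.
texts = ["a moving film", "the very best of them", "hardly any fun to watch", "ultimately flawed"]
labels = [1, 1, 0, 0]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Because n samples give n(n-1)/2 unordered pairs, even a small labeled set produces many training pairs, which is what makes the approach workable in few-shot settings.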

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-small-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1	'a sensitive , modest comic tragedy that works as both character study and symbolic examination of the huge economic changes sweeping modern china .' 'the year 2002 has conjured up more coming-of-age stories than seem possible , but take care of my cat emerges as the very best of them .' 'amy and matthew have a bit of a phony relationship , but the film works in spite of it .'
0	'works on the whodunit level as its larger themes get lost in the murk of its own making' "one of those strained caper movies that 's hardly any fun to watch and begins to vaporize from your memory minutes after it ends ." "shunji iwai 's all about lily chou chou is a beautifully shot , but ultimately flawed film about growing up in japan ."

Evaluation

Metrics

Label	Accuracy
all	0.8622
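
Accuracy here is the fraction of evaluation examples whose predicted label matches the reference label. A minimal sketch, using made-up labels rather than this model's actual outputs:

```python
def accuracy(predictions, references):
    """Fraction of positions where the predicted label equals the reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Hypothetical labels for illustration only.
example_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(example_accuracy)
```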

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Jorgeutd/setfit-bge-small-v1.5-sst2-50-shot")
# Run inference
preds = model("it 's a bad sign in a thriller when you instantly know whodunit .")
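
The model can also score several sentences in one call when given a list of strings. The sketch below wraps that pattern in a small helper. `classify_reviews` and the `LABEL_NAMES` mapping are hypothetical, and reading label 1 as positive and 0 as negative is an assumption based on the examples in the Model Labels table. A stand-in callable is used so the snippet runs without downloading the model.

```python
# Assumed mapping, inferred from the label examples in this card.
LABEL_NAMES = {0: "negative", 1: "positive"}

def classify_reviews(model, texts):
    """Run `model` on a list of texts and pair each text with a readable label.

    `model` is any callable that takes a list of strings and returns one
    integer-like label per string, e.g. a loaded SetFitModel.
    """
    predictions = model(list(texts))
    return [(text, LABEL_NAMES[int(pred)]) for text, pred in zip(texts, predictions)]

def fake_model(texts):
    """Stand-in for the real model: flags any text containing 'bad' as negative."""
    return [0 if "bad" in text else 1 for text in texts]

results = classify_reviews(fake_model, ["a bad sign in a thriller", "the very best of them"])
print(results)

# With the real model (requires `pip install setfit` and network access):
# from setfit import SetFitModel
# model = SetFitModel.from_pretrained("Jorgeutd/setfit-bge-small-v1.5-sst2-50-shot")
# classify_reviews(model, ["it 's a bad sign in a thriller when you instantly know whodunit ."])
```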

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	3	21.31	50

Label	Training Sample Count
0	50
1	50

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (10, 10)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
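
With `loss: CosineSimilarityLoss`, each training pair contributes the squared difference between the cosine similarity of its two embeddings and its target (1.0 for same-label pairs, 0.0 for different-label pairs), averaged over the batch. The following is a plain-Python sketch of that computation on hypothetical 3-dimensional embeddings, not the Sentence Transformers implementation. As far as the SetFit documentation describes, `distance_metric` and `margin` apply only to triplet-style losses, so they likely had no effect in this run.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target over (u, v, target) triples."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)

# Hypothetical embeddings chosen so each pair's error is easy to read off.
batch = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 1.0),  # identical vectors, same label: error 0
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], 0.0),  # orthogonal vectors, different label: error 0
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 0.0),  # identical vectors, different label: error 1
]
loss = cosine_similarity_loss(batch)
print(loss)
```

Minimizing this quantity pulls same-label embeddings together and pushes different-label embeddings toward orthogonality.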

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0031	1	0.2515	-
0.1567	50	0.2298	-
0.3135	100	0.2134	-
0.4702	150	0.0153	-
0.6270	200	0.0048	-
0.7837	250	0.0024	-
0.9404	300	0.0023	-
1.0972	350	0.0016	-
1.2539	400	0.0016	-
1.4107	450	0.001	-
1.5674	500	0.0013	-
1.7241	550	0.0008	-
1.8809	600	0.0008	-
2.0376	650	0.0007	-
2.1944	700	0.0008	-
2.3511	750	0.0008	-
2.5078	800	0.0007	-
2.6646	850	0.0006	-
2.8213	900	0.0006	-
2.9781	950	0.0005	-
3.1348	1000	0.0006	-
3.2915	1050	0.0006	-
3.4483	1100	0.0005	-
3.6050	1150	0.0005	-
3.7618	1200	0.0005	-
3.9185	1250	0.0005	-
4.0752	1300	0.0005	-
4.2320	1350	0.0004	-
4.3887	1400	0.0004	-
4.5455	1450	0.0004	-
4.7022	1500	0.0003	-
4.8589	1550	0.0006	-
5.0157	1600	0.0007	-
5.1724	1650	0.0004	-
5.3292	1700	0.0004	-
5.4859	1750	0.0004	-
5.6426	1800	0.0004	-
5.7994	1850	0.0003	-
5.9561	1900	0.0004	-
6.1129	1950	0.0003	-
6.2696	2000	0.0003	-
6.4263	2050	0.0005	-
6.5831	2100	0.0003	-
6.7398	2150	0.0003	-
6.8966	2200	0.0003	-
7.0533	2250	0.0003	-
7.2100	2300	0.0003	-
7.3668	2350	0.0003	-
7.5235	2400	0.0002	-
7.6803	2450	0.0003	-
7.8370	2500	0.0003	-
7.9937	2550	0.0003	-
8.1505	2600	0.0003	-
8.3072	2650	0.0003	-
8.4639	2700	0.0003	-
8.6207	2750	0.0003	-
8.7774	2800	0.0004	-
8.9342	2850	0.0002	-
9.0909	2900	0.0003	-
9.2476	2950	0.0004	-
9.4044	3000	0.0004	-
9.5611	3050	0.0003	-
9.7179	3100	0.0004	-
9.8746	3150	0.0003	-
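
The Epoch column advances by about 1/319 per step (for example, step 50 is logged at epoch 0.1567), which suggests 319 optimizer steps per epoch. One way to arrive at that number is sketched below. It assumes the sampler counts every unordered pair of training samples, including each sample paired with itself, and that `oversampling` repeats the smaller group of pairs until it matches the larger. These sampler details are assumptions and are not stated in this card.

```python
import math

samples_per_class = 50   # from the Training Sample Count table
num_classes = 2
batch_size = 16          # first value of batch_size

# Same-label pairs per class: C(50, 2) distinct pairs plus 50 self-pairs (assumed).
positive_pairs = num_classes * (math.comb(samples_per_class, 2) + samples_per_class)
# Different-label pairs: every class-0 sample with every class-1 sample.
negative_pairs = samples_per_class * samples_per_class

# Oversampling (assumed): both groups are brought up to the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs, pairs_per_epoch, steps_per_epoch)
print(round(50 / steps_per_epoch, 4), round(3150 / steps_per_epoch, 4))
```

Under these assumptions the result is 319 steps per epoch, and 50/319 and 3150/319 round to 0.1567 and 9.8746, matching the logged rows.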

Framework Versions

Python: 3.10.13
SetFit: 1.0.3
Sentence Transformers: 2.6.1
Transformers: 4.39.1
PyTorch: 2.1.0
Datasets: 2.18.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
