
SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
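Step 1 works by turning the small labeled set into sentence pairs: pairs that share a label become positives, pairs with different labels become negatives, and the embedding model is fine-tuned so that cosine similarity matches those targets. A minimal, illustrative sketch of that pair generation (SetFit's own samplers are more sophisticated; the example texts below are hypothetical):

```python
from itertools import combinations

def contrastive_pairs(examples):
    """Turn (text, label) examples into (text_a, text_b, target) pairs.

    Same-label pairs get target 1.0, different-label pairs 0.0 --
    the targets used by a cosine-similarity contrastive loss.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs

# Hypothetical few-shot examples in the style of this card's labels.
examples = [
    ("What's the next one?", "next_question"),
    ("What's next?", "next_question"),
    ("That's everything I have to say", "end_question"),
]
pairs = contrastive_pairs(examples)
# Three pairs: one positive (next_question/next_question), two negatives.
```

In step 2, the fine-tuned encoder embeds each training sentence once more, and those embeddings become the feature matrix for the LogisticRegression head.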

Model Details

Model Description

Model Sources

Model Labels

Label Examples
wrapup_question
  • "That's the full breakdown of my solution. I'm happy to discuss any part of it in more detail if needed."
  • 'I hope that was helpful, is there anything else you want me to touch on?'
  • 'Is there anything else you want me to touch on before we move on?'
none
  • "Next I want to go on and talk about, you know, the various user segments associated with this and prioritize who we'd want to focus on and what we should be building for. by doing this we'll kind of be able to identify what are those people problems that we need. And then lastly come up with a couple of solutions and prioritize. Does all that sound"
  • 'I must proceed with answering this question.'
  • "One of the biggest threats to Salesforce's business is the emergence of new competitors in the market. With the rise of cloud-based CRM solutions, there are now many alternatives to Salesforce that offer similar features and functionality. Additionally, there is a growing trend towards open-source software, which could potentially disrupt the entire CRM industry. To stay ahead of these threats, Salesforce will need to continue innovating and differentiating themselves from their competitors, while also keeping a close eye on emerging technologies and market trends."
end_question
  • "I've given all the details I can for this question"
  • 'That should be sufficient to answer your question'
  • "That's everything I have to say about this question"
next_question
  • "I'm ready to hear what else you have to ask. What's the next topic?"
  • "I've given that question a lot of thought. What's next?"
  • "I hope I answered your question to your satisfaction. What's the next one?"

Evaluation

Metrics

  • Accuracy (evaluated over all labels): 0.9322
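The reported value is plain multiclass accuracy computed over all four intent labels jointly. As a reminder of what that means, a minimal illustration with hypothetical gold labels and predictions (not taken from the actual evaluation set):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical gold labels and predictions over this card's four intents.
y_true = ["end_question", "none", "next_question", "wrapup_question"]
y_pred = ["end_question", "none", "none", "wrapup_question"]
acc = accuracy(y_true, y_pred)  # 3 of 4 correct -> 0.75
```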

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("nksk/Intent_bge-small-en-v1.5_v2.0")
# Run inference
preds = model("60 seconds")

Training Details

Training Set Metrics

Training set statistics:

  • Word count — min: 3, median: 36.5665, max: 506

Training samples per label:

  • end_question: 34
  • next_question: 25
  • none: 135
  • wrapup_question: 39

Training Hyperparameters

  • batch_size: (32, 16)
  • num_epochs: (3, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.0005
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: True
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
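The values above correspond to keyword arguments of setfit's TrainingArguments (setfit 1.1 API); tuple values apply to the (embedding fine-tuning, classifier head) phases respectively. A sketch of how they could be collected for a reproduction run — indicative only, not the exact training script used for this model:

```python
# Hyperparameters from this card, keyed as setfit.TrainingArguments keyword
# arguments. Tuples are (embedding fine-tuning phase, classifier head phase).
training_kwargs = dict(
    batch_size=(32, 16),
    num_epochs=(3, 10),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.0005,
    margin=0.25,
    end_to_end=True,
    use_amp=True,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
# `loss` (CosineSimilarityLoss) and `distance_metric` are passed as classes /
# callables from sentence_transformers rather than strings, so they are left
# out of this plain-dict sketch.

# Usage sketch (requires `pip install setfit` and a labeled train_dataset):
# from setfit import SetFitModel, Trainer, TrainingArguments
# model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5")
# trainer = Trainer(model=model, args=TrainingArguments(**training_kwargs),
#                   train_dataset=train_dataset)
# trainer.train()
```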

Training Results

Epoch Step Training Loss Validation Loss
0.0010 1 0.2347 -
0.0488 50 0.2472 -
0.0977 100 0.2087 -
0.1465 150 0.1294 -
0.1953 200 0.0639 -
0.2441 250 0.0324 -
0.2930 300 0.0163 -
0.3418 350 0.0085 -
0.3906 400 0.0047 -
0.4395 450 0.0028 -
0.4883 500 0.0024 -
0.5371 550 0.0017 -
0.5859 600 0.0018 -
0.6348 650 0.0015 -
0.6836 700 0.0014 -
0.7324 750 0.0012 -
0.7812 800 0.0011 -
0.8301 850 0.0011 -
0.8789 900 0.0010 -
0.9277 950 0.0009 -
0.9766 1000 0.0009 -
1.0254 1050 0.0009 -
1.0742 1100 0.0008 -
1.1230 1150 0.0008 -
1.1719 1200 0.0007 -
1.2207 1250 0.0007 -
1.2695 1300 0.0007 -
1.3184 1350 0.0007 -
1.3672 1400 0.0007 -
1.4160 1450 0.0006 -
1.4648 1500 0.0007 -
1.5137 1550 0.0006 -
1.5625 1600 0.0006 -
1.6113 1650 0.0006 -
1.6602 1700 0.0005 -
1.7090 1750 0.0005 -
1.7578 1800 0.0005 -
1.8066 1850 0.0005 -
1.8555 1900 0.0005 -
1.9043 1950 0.0005 -
1.9531 2000 0.0005 -
2.0020 2050 0.0005 -
2.0508 2100 0.0005 -
2.0996 2150 0.0005 -
2.1484 2200 0.0005 -
2.1973 2250 0.0004 -
2.2461 2300 0.0005 -
2.2949 2350 0.0005 -
2.3438 2400 0.0004 -
2.3926 2450 0.0004 -
2.4414 2500 0.0004 -
2.4902 2550 0.0004 -
2.5391 2600 0.0004 -
2.5879 2650 0.0004 -
2.6367 2700 0.0004 -
2.6855 2750 0.0004 -
2.7344 2800 0.0004 -
2.7832 2850 0.0004 -
2.8320 2900 0.0004 -
2.8809 2950 0.0004 -
2.9297 3000 0.0004 -
2.9785 3050 0.0004 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.2
  • PyTorch: 2.4.1+cu121
  • Datasets: 3.0.1
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
Model size: 33.4M params (Safetensors, F32 tensors)
