SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model that can be used for Text Classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
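For illustration, a minimal sketch of this two-phase training loop with the setfit library (the dataset choice and the 50-per-label sampling here are assumptions that match the model name, not the author's exact training script):

from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments, sample_dataset

# Assumption: the SetFit/sst2 dataset, which exposes "text" and "label" columns.
full_train = load_dataset("SetFit/sst2", split="train")
# 50 examples per label, matching the 50-shot setup in the model name.
train_ds = sample_dataset(full_train, label_column="label", num_samples=50)

model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5")
args = TrainingArguments(batch_size=16, num_epochs=10, loss=CosineSimilarityLoss)

trainer = Trainer(model=model, args=args, train_dataset=train_ds)
# Phase 1: contrastive fine-tuning of the embedding body.
# Phase 2: fitting the LogisticRegression head on the tuned embeddings.
trainer.train()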

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: BAAI/bge-small-en-v1.5
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2
  • Model size: 33.4M parameters (F32)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Label Examples

Label 1 (positive)
  • 'a sensitive , modest comic tragedy that works as both character study and symbolic examination of the huge economic changes sweeping modern china .'
  • 'the year 2002 has conjured up more coming-of-age stories than seem possible , but take care of my cat emerges as the very best of them .'
  • 'amy and matthew have a bit of a phony relationship , but the film works in spite of it .'
Label 0 (negative)
  • 'works on the whodunit level as its larger themes get lost in the murk of its own making'
  • "one of those strained caper movies that 's hardly any fun to watch and begins to vaporize from your memory minutes after it ends ."
  • "shunji iwai 's all about lily chou chou is a beautifully shot , but ultimately flawed film about growing up in japan ."

Evaluation

Metrics

Label  Accuracy
all    0.8622
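
A sketch of how such an accuracy figure can be computed (the SetFit/sst2 validation split and its column names are assumptions):

from datasets import load_dataset
from setfit import SetFitModel

model = SetFitModel.from_pretrained("Jorgeutd/setfit-bge-small-v1.5-sst2-50-shot")

# Assumption: the SetFit/sst2 validation split with "text" and "label" columns.
eval_ds = load_dataset("SetFit/sst2", split="validation")
preds = model.predict(eval_ds["text"])

# Fraction of predictions that match the reference labels.
accuracy = sum(int(p) == int(y) for p, y in zip(preds, eval_ds["label"])) / len(eval_ds)
print(f"accuracy: {accuracy:.4f}")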

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Jorgeutd/setfit-bge-small-v1.5-sst2-50-shot")
# Run inference
preds = model("it 's a bad sign in a thriller when you instantly know whodunit .")
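
SetFitModel also accepts a batch of sentences, and because the head is a LogisticRegression, class probabilities are available as well; a brief sketch (the example sentences are placeholders):

# Batched inference on a list of sentences
preds = model([
    "the performances are uniformly excellent .",
    "a tedious , overlong mess .",
])
# Class probabilities from the LogisticRegression head
probs = model.predict_proba(["a tedious , overlong mess ."])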

Training Details

Training Set Metrics

Training set  Min  Median  Max
Word count    3    21.31   50

Label  Training Sample Count
0      50
1      50

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
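
For reference, a sketch of how these values map onto setfit.TrainingArguments (this mirrors the list above rather than reproducing the author's exact training script):

from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(16, 16),                 # (embedding phase, classifier phase)
    num_epochs=(10, 10),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),   # (body during embedding phase, body during classifier phase)
    head_learning_rate=0.01,
    loss=CosineSimilarityLoss,
    # distance_metric defaults to cosine_distance; margin only applies to margin-based losses.
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)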

Training Results

Epoch Step Training Loss Validation Loss
0.0031 1 0.2515 -
0.1567 50 0.2298 -
0.3135 100 0.2134 -
0.4702 150 0.0153 -
0.6270 200 0.0048 -
0.7837 250 0.0024 -
0.9404 300 0.0023 -
1.0972 350 0.0016 -
1.2539 400 0.0016 -
1.4107 450 0.001 -
1.5674 500 0.0013 -
1.7241 550 0.0008 -
1.8809 600 0.0008 -
2.0376 650 0.0007 -
2.1944 700 0.0008 -
2.3511 750 0.0008 -
2.5078 800 0.0007 -
2.6646 850 0.0006 -
2.8213 900 0.0006 -
2.9781 950 0.0005 -
3.1348 1000 0.0006 -
3.2915 1050 0.0006 -
3.4483 1100 0.0005 -
3.6050 1150 0.0005 -
3.7618 1200 0.0005 -
3.9185 1250 0.0005 -
4.0752 1300 0.0005 -
4.2320 1350 0.0004 -
4.3887 1400 0.0004 -
4.5455 1450 0.0004 -
4.7022 1500 0.0003 -
4.8589 1550 0.0006 -
5.0157 1600 0.0007 -
5.1724 1650 0.0004 -
5.3292 1700 0.0004 -
5.4859 1750 0.0004 -
5.6426 1800 0.0004 -
5.7994 1850 0.0003 -
5.9561 1900 0.0004 -
6.1129 1950 0.0003 -
6.2696 2000 0.0003 -
6.4263 2050 0.0005 -
6.5831 2100 0.0003 -
6.7398 2150 0.0003 -
6.8966 2200 0.0003 -
7.0533 2250 0.0003 -
7.2100 2300 0.0003 -
7.3668 2350 0.0003 -
7.5235 2400 0.0002 -
7.6803 2450 0.0003 -
7.8370 2500 0.0003 -
7.9937 2550 0.0003 -
8.1505 2600 0.0003 -
8.3072 2650 0.0003 -
8.4639 2700 0.0003 -
8.6207 2750 0.0003 -
8.7774 2800 0.0004 -
8.9342 2850 0.0002 -
9.0909 2900 0.0003 -
9.2476 2950 0.0004 -
9.4044 3000 0.0004 -
9.5611 3050 0.0003 -
9.7179 3100 0.0004 -
9.8746 3150 0.0003 -

Framework Versions

  • Python: 3.10.13
  • SetFit: 1.0.3
  • Sentence Transformers: 2.6.1
  • Transformers: 4.39.1
  • PyTorch: 2.1.0
  • Datasets: 2.18.0
  • Tokenizers: 0.15.2
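
To approximate this environment, the versions above can be pinned at install time (a convenience suggestion, not part of the original card):

pip install "setfit==1.0.3" "sentence-transformers==2.6.1" "transformers==4.39.1" "torch==2.1.0" "datasets==2.18.0" "tokenizers==0.15.2"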

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}