SetFit with sentence-transformers/all-MiniLM-L6-v2

This is a SetFit model that can be used for Text Classification. This SetFit model uses sentence-transformers/all-MiniLM-L6-v2 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
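
As a rough illustration of step 1, contrastive fine-tuning works on sentence pairs rather than single sentences: two texts with the same label form a positive pair, and two texts with different labels form a negative pair. The sketch below builds such pairs from a few of the label examples listed in this card. The helper name make_pairs and the 1.0/0.0 targets are assumptions made for illustration; this is not SetFit's internal sampler.

```python
from itertools import combinations

# A few labelled examples taken from this card's label table.
examples = [
    ("Launch microphone app", "microphone"),
    ("Access mic app", "microphone"),
    ("View chat logs", "history"),
    ("Show history", "history"),
    ("Open the photo webcam", "camera"),
]


def make_pairs(samples):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration only.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


pairs = make_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]

print(f"{len(pairs)} pairs: {len(positives)} positive, {len(negatives)} negative")
```

Even with five sentences the negatives outnumber the positives, which is why a sampling strategy that balances the two sets matters.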

Model Details

Model Description

Model Sources

Model Labels

Label Examples
microphone
  • 'Launch microphone app'
  • 'Launch recording app'
  • 'Access mic app'
history
  • 'View chat logs'
  • 'Display conversation details'
  • 'Show history'
camera
  • 'Switch to webcam mode please'
  • 'Could you switch to video camera mode?'
  • 'Open the photo webcam'

Evaluation

Metrics

Label    Accuracy
all      1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("porxelek/word-classification")
# Run inference
preds = model("Show recent chats")

Training Details

Training Set Metrics

Training set    Min    Median    Max
Word count      2      4.1364    10

Label         Training Sample Count
camera        250
history       150
microphone    150

Training Hyperparameters

  • batch_size: (64, 64)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
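
To make the loss setting above concrete: CosineSimilarityLoss, as implemented in Sentence Transformers, compares the cosine similarity of two sentence embeddings with the pair's target value using mean squared error by default. The sketch below reproduces that computation in plain Python on toy 2-dimensional vectors. The vectors and function names are assumptions chosen for illustration, not embeddings produced by this model. As far as the SetFit documentation describes, distance_metric and margin apply to triplet-style losses and are not used with this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and the pair target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy (embedding_a, embedding_b, target) triples, purely illustrative.
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # same direction, target 1.0: no error
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal, target 0.0: no error
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # same direction, target 0.0: full error
]

loss = cosine_similarity_loss(toy_pairs)
print(f"loss = {loss:.4f}")
```

Driving this loss down pulls same-label sentences together in embedding space and pushes different-label sentences apart, which is what the classification head then relies on.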

Training Results

Epoch Step Training Loss Validation Loss
0.0003 1 0.1209 -
0.0164 50 0.1449 -
0.0328 100 0.046 -
0.0492 150 0.0099 -
0.0656 200 0.0049 -
0.0820 250 0.0036 -
0.0985 300 0.0022 -
0.1149 350 0.0015 -
0.1313 400 0.0011 -
0.1477 450 0.001 -
0.1641 500 0.0009 -
0.1805 550 0.0009 -
0.1969 600 0.0009 -
0.2133 650 0.0008 -
0.2297 700 0.0007 -
0.2461 750 0.0006 -
0.2626 800 0.0006 -
0.2790 850 0.0006 -
0.2954 900 0.0006 -
0.3118 950 0.0005 -
0.3282 1000 0.0004 -
0.3446 1050 0.0005 -
0.3610 1100 0.0005 -
0.3774 1150 0.0004 -
0.3938 1200 0.0004 -
0.4102 1250 0.0004 -
0.4266 1300 0.0005 -
0.4431 1350 0.0004 -
0.4595 1400 0.0003 -
0.4759 1450 0.0003 -
0.4923 1500 0.0003 -
0.5087 1550 0.0003 -
0.5251 1600 0.0003 -
0.5415 1650 0.0003 -
0.5579 1700 0.0003 -
0.5743 1750 0.0003 -
0.5907 1800 0.0003 -
0.6072 1850 0.0002 -
0.6236 1900 0.0003 -
0.6400 1950 0.0002 -
0.6564 2000 0.0002 -
0.6728 2050 0.0002 -
0.6892 2100 0.0003 -
0.7056 2150 0.0002 -
0.7220 2200 0.0002 -
0.7384 2250 0.0002 -
0.7548 2300 0.0002 -
0.7713 2350 0.0002 -
0.7877 2400 0.0002 -
0.8041 2450 0.0002 -
0.8205 2500 0.0002 -
0.8369 2550 0.0002 -
0.8533 2600 0.0002 -
0.8697 2650 0.0002 -
0.8861 2700 0.0002 -
0.9025 2750 0.0002 -
0.9189 2800 0.0002 -
0.9353 2850 0.0002 -
0.9518 2900 0.0002 -
0.9682 2950 0.0002 -
0.9846 3000 0.0002 -
1.0 3047 - 0.0
  • The bold row denotes the saved checkpoint.
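
The step count in the table can be cross-checked against the training set sizes and the batch size of 64 reported in this card. The sketch below assumes that pairs are unordered, that a sentence is never paired with itself, and that oversampling repeats the smaller of the positive and negative sets until it matches the larger one. These are assumptions about the sampler, not a statement of its implementation, although under them the arithmetic reproduces the 3047 steps shown above.

```python
import math

# Per-label training sample counts and batch size, as reported in this card.
counts = {"camera": 250, "history": 150, "microphone": 150}
batch_size = 64

# Positive pairs: unordered pairs drawn from within a single label.
positive = sum(n * (n - 1) // 2 for n in counts.values())

# Negative pairs: all unordered pairs minus the positive ones.
total = sum(counts.values())
negative = total * (total - 1) // 2 - positive

# Assumed oversampling: the smaller set is repeated to match the larger.
oversampled = 2 * max(positive, negative)
steps = math.ceil(oversampled / batch_size)


def epoch_fraction(step, total_steps=steps):
    """Epoch value shown in the table for a given step."""
    return round(step / total_steps, 4)


print(positive, negative, oversampled, steps)
print(epoch_fraction(1), epoch_fraction(50), epoch_fraction(3000))
```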

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.3.1+cu121
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 22.7M params (Safetensors, tensor type F32)