SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
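
The first step depends on turning a handful of labelled texts into sentence pairs. Below is a minimal sketch of that pair construction, assuming every unordered pair of training texts is used, with same-label pairs given a target similarity of 1.0 and different-label pairs 0.0. The function name `make_contrastive_pairs` and the toy data are illustrative only and are not part of the SetFit API.

```python
from itertools import combinations

def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label, otherwise 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Toy data borrowed from the label examples in this card.
texts = [
    "How to identify mashru silk",
    "How can I verify the authenticity of Real Zari in a saree",
    "show me some trending sneakers under 25k",
    "bakery boxes with custom designs",
]
labels = [
    "general_faq",
    "general_faq",
    "product discoverability",
    "product discoverability",
]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even four examples yield six pairs, which is part of why this approach can work with very little labelled data.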

Model Details

Model Description

Model Sources

Model Labels

Label Examples
general_faq
  • 'What makes Banarasi silk sarees unique compared to other types of sarees, and what are their main varieties?'
  • 'How to identify mashru silk'
  • 'How can I verify the authenticity of Real Zari in a saree'
product discoverability
  • 'bakery boxes with custom designs'
  • 'What are the different fabric options available for sarees?'
  • 'show me some trending sneakers under 25k'
product faq
  • 'Is the Wmns Dunk Low Harvest Moon available in size 7?'
  • 'What type of color is the Pure Katan silk Kadhwa Bootidaar Banarasi Saree?'
  • 'What type of color is the Pure Katan Silk Pastel Orange Kadhwa Satin Tanchoi Banarasi Saree?'
product policy
  • 'What is the policy for returning a product that was part of a special sale celebration?'
  • 'Can I return an item if it was damaged during delivery preparation?'
  • 'Do you offer express shipping for sneakers?'
order tracking
  • 'I ordered the Cupcake Cases 3 days ago with order no 34567 how long will it take to deliver?'
  • 'Do you provide shipping insurance for high-value orders?'
  • 'My order has been shipped 1 day ago but still not out for delivery. Can you tell how long will it take to deliver?'

Evaluation

Metrics

Label Accuracy
all 0.9245

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Shankhdhar/classifier_woog_hkv")
# Run inference
preds = model("cookie boxes with inserts")
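
The predicted label can then drive downstream routing, for example in a shopping assistant. Below is a minimal sketch, assuming the model returns one of the five label strings listed in this card. The `route` function, the handler names, and the `fake_classifier` stand-in (used so the snippet runs without downloading the model) are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical mapping from the card's labels to downstream handlers.
HANDLERS: Dict[str, str] = {
    "general_faq": "faq_bot",
    "product discoverability": "search_service",
    "product faq": "product_qa",
    "product policy": "policy_lookup",
    "order tracking": "order_status",
}

def route(query: str, classify: Callable[[str], str]) -> str:
    """Classify a query and return the name of the handler that should serve it."""
    label = str(classify(query))
    # Fall back to a human agent for labels outside the known set.
    return HANDLERS.get(label, "human_agent")

def fake_classifier(query: str) -> str:
    """Stand-in for the real model; always predicts 'product discoverability'."""
    return "product discoverability"

print(route("cookie boxes with inserts", fake_classifier))
```

With the real model, `classify` could be something like `lambda q: model.predict([q])[0]`, assuming predictions come back as label strings rather than integer ids.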

Training Details

Training Set Metrics

Training set Min Median Max
Word count 4 11.9441 24

Label Training Sample Count
general_faq 4
order tracking 28
product discoverability 40
product faq 40
product policy 31

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
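
Together with the training sample counts reported under Training Set Metrics, these settings roughly determine how many optimization steps make up one epoch of the contrastive phase. Below is a back-of-the-envelope sketch, assuming every unordered pair of training texts forms one candidate pair and that `oversampling` draws as many positive as negative pairs by repeating the smaller group. This is an interpretation of the sampling strategy, not code taken from SetFit.

```python
from math import ceil, comb

# Training sample counts per label, as reported in this card.
counts = {
    "general_faq": 4,
    "order tracking": 28,
    "product discoverability": 40,
    "product faq": 40,
    "product policy": 31,
}
batch_size = 16  # embedding-phase batch size from the hyperparameters

total = sum(counts.values())
positive_pairs = sum(comb(n, 2) for n in counts.values())  # same-label pairs
negative_pairs = comb(total, 2) - positive_pairs           # different-label pairs

# Oversampling (assumed): both groups are drawn up to the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(total, positive_pairs, negative_pairs, steps_per_epoch)
# prints: 143 2409 7744 968
```

Under these assumptions one epoch is about 968 steps, which is consistent with the Training Results log, where step 50 corresponds to epoch 0.0517 (50 / 968 ≈ 0.0517).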

Training Results

Epoch Step Training Loss Validation Loss
0.0010 1 0.3031 -
0.0517 50 0.1396 -
0.1033 100 0.0959 -
0.1550 150 0.0036 -
0.2066 200 0.0009 -
0.2583 250 0.0008 -
0.3099 300 0.0011 -
0.3616 350 0.0005 -
0.4132 400 0.0004 -
0.4649 450 0.0003 -
0.5165 500 0.0003 -
0.5682 550 0.0003 -
0.6198 600 0.0003 -
0.6715 650 0.0001 -
0.7231 700 0.0002 -
0.7748 750 0.0001 -
0.8264 800 0.0002 -
0.8781 850 0.0002 -
0.9298 900 0.0001 -
0.0010 1 0.0002 -
0.0517 50 0.0002 -
0.1033 100 0.0007 -
0.1550 150 0.0001 -
0.2066 200 0.0002 -
0.2583 250 0.0002 -
0.3099 300 0.0001 -
0.3616 350 0.0502 -
0.4132 400 0.0001 -
0.4649 450 0.0001 -
0.5165 500 0.0001 -
0.5682 550 0.0001 -
0.6198 600 0.0 -
0.6715 650 0.0 -
0.7231 700 0.0001 -
0.7748 750 0.0 -
0.8264 800 0.0001 -
0.8781 850 0.0001 -
0.9298 900 0.0001 -
0.9814 950 0.0001 -
1.0331 1000 0.0001 -
1.0847 1050 0.0001 -
1.1364 1100 0.0 -
1.1880 1150 0.0 -
1.2397 1200 0.0 -
1.2913 1250 0.0 -
1.3430 1300 0.0001 -
1.3946 1350 0.0 -
1.4463 1400 0.0 -
1.4979 1450 0.0 -
1.5496 1500 0.0 -
1.6012 1550 0.0 -
1.6529 1600 0.0 -
1.7045 1650 0.0 -
1.7562 1700 0.0001 -
1.8079 1750 0.0 -
1.8595 1800 0.0 -
1.9112 1850 0.0 -
1.9628 1900 0.0 -
0.0010 1 0.0 -
0.0517 50 0.0 -
0.1033 100 0.0001 -
0.1550 150 0.0 -
0.2066 200 0.0001 -
0.2583 250 0.0001 -
0.3099 300 0.0 -
0.3616 350 0.0402 -
0.4132 400 0.0001 -
0.4649 450 0.0 -
0.5165 500 0.0 -
0.5682 550 0.0 -
0.6198 600 0.0 -
0.6715 650 0.0 -
0.7231 700 0.0 -
0.7748 750 0.0 -
0.8264 800 0.0 -
0.8781 850 0.0 -
0.9298 900 0.0 -
0.9814 950 0.0 -
1.0331 1000 0.0 -
1.0847 1050 0.0 -
1.1364 1100 0.0 -
1.1880 1150 0.0 -
1.2397 1200 0.0 -
1.2913 1250 0.0 -
1.3430 1300 0.0 -
1.3946 1350 0.0 -
1.4463 1400 0.0 -
1.4979 1450 0.0 -
1.5496 1500 0.0 -
1.6012 1550 0.0 -
1.6529 1600 0.0 -
1.7045 1650 0.0 -
1.7562 1700 0.0 -
1.8079 1750 0.0 -
1.8595 1800 0.0 -
1.9112 1850 0.0 -
1.9628 1900 0.0 -

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.39.0
  • PyTorch: 2.2.2+cu121
  • Datasets: 2.20.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 109M params (Safetensors, tensor type F32)