SetFit with projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base

This is a SetFit model for text classification. It uses projecte-aina/ST-NLI-ca_paraphrase-multilingual-mpnet-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
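
Step 1 works on sentence pairs built from the labeled examples. The following is a minimal sketch of how such pairs might be formed, assuming same-label pairs receive a target similarity of 1.0 and different-label pairs 0.0. `make_pairs` is a hypothetical helper for illustration, not part of the SetFit API, and it ignores the sampling strategy used during actual training.

```python
from itertools import combinations


def make_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    Assumption: target is 1.0 when both texts share a label, else 0.0.
    """
    examples = list(zip(texts, labels))
    return [
        (a, b, 1.0 if label_a == label_b else 0.0)
        for (a, label_a), (b, label_b) in combinations(examples, 2)
    ]


# Sentences taken from this card's label examples.
texts = [
    "Bona nit, com estàs?",
    "Ei, què tal tot?",
    "Quin és el propòsit de la llicència administrativa?",
    "Què acredita el certificat d'empadronament col·lectiu?",
]
labels = [1, 1, 0, 0]
pairs = make_pairs(texts, labels)
```

Four labeled sentences yield six pairs, two of which are positive. This quadratic growth in pairs is what lets a small labeled set supply many training signals.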

Model Details

Model Description

Model Sources

Model Labels

Label Examples
1
  • 'Bona nit, com estàs?'
  • 'Ei, què tal tot?'
  • 'Hola, com està el temps?'
0
  • 'Quin és el propòsit de la llicència administrativa?'
  • 'Quin és el benefici de les subvencions per als infants?'
  • "Què acredita el certificat d'empadronament col·lectiu?"

Evaluation

Metrics

Label Accuracy
all 0.9978

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("adriansanz/greetings-v2")
# Run inference
preds = model("Salut, tanque's")
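
For several inputs at once, a sketch along the following lines may be useful. The label names are an assumption inferred from the label examples in this card (1 looks like greetings, 0 like administrative questions); the card itself does not name the classes. `name_predictions` and `classify` are hypothetical helpers.

```python
# Assumed label meanings, inferred from the label examples in this card.
LABEL_NAMES = {0: "administrative question", 1: "greeting"}


def name_predictions(preds):
    """Map raw 0/1 predictions to the assumed human-readable names."""
    return [LABEL_NAMES[int(p)] for p in preds]


def classify(texts, model_id="adriansanz/greetings-v2"):
    """Download the model from the Hub and classify a list of texts."""
    from setfit import SetFitModel  # imported lazily; requires `pip install setfit`

    model = SetFitModel.from_pretrained(model_id)
    return name_predictions(model.predict(texts))
```

Calling `classify(["Bona nit, com estàs?", "Quin és el propòsit de la llicència administrativa?"])` would download the model on first use and return one name per input.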

Training Details

Training Set Metrics

Training set    Min    Median    Max
Word count      2      9.8187    23

Label    Training Sample Count
0        100
1        60

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (3, 3)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
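
As a rough sketch, the values above could be passed to SetFit's `TrainingArguments` as shown below. Tuple values give one setting for the embedding phase and one for the classifier phase. The mapping of `loss` and `distance_metric` to the `sentence_transformers.losses` objects is an assumption based on their names, and `build_training_arguments` is a hypothetical helper.

```python
# Hyperparameters copied from the list above.
# Tuples are (embedding phase, classifier phase).
HYPERPARAMETERS = {
    "batch_size": (16, 16),
    "num_epochs": (3, 3),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "l2_weight": 0.01,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Build a setfit.TrainingArguments object from the values above."""
    # Imported lazily; requires `pip install setfit`.
    from sentence_transformers.losses import (
        BatchHardTripletLossDistanceFunction,
        CosineSimilarityLoss,
    )
    from setfit import TrainingArguments

    return TrainingArguments(
        loss=CosineSimilarityLoss,
        # Assumed to correspond to "cosine_distance" in the list above.
        distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
        **HYPERPARAMETERS,
    )
```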

Training Results

Epoch Step Training Loss Validation Loss
0.0012 1 0.2127 -
0.0581 50 0.1471 -
0.1163 100 0.0168 -
0.1744 150 0.001 -
0.2326 200 0.0004 -
0.2907 250 0.0002 -
0.3488 300 0.0001 -
0.4070 350 0.0001 -
0.4651 400 0.0001 -
0.5233 450 0.0001 -
0.5814 500 0.0001 -
0.6395 550 0.0001 -
0.6977 600 0.0001 -
0.7558 650 0.0 -
0.8140 700 0.0 -
0.8721 750 0.0 -
0.9302 800 0.0 -
0.9884 850 0.0 -
1.0465 900 0.0 -
1.1047 950 0.0 -
1.1628 1000 0.0 -
1.2209 1050 0.0 -
1.2791 1100 0.0 -
1.3372 1150 0.0 -
1.3953 1200 0.0 -
1.4535 1250 0.0 -
1.5116 1300 0.0 -
1.5698 1350 0.0 -
1.6279 1400 0.0 -
1.6860 1450 0.0 -
1.7442 1500 0.0 -
1.8023 1550 0.0 -
1.8605 1600 0.0 -
1.9186 1650 0.0 -
1.9767 1700 0.0 -
2.0349 1750 0.0 -
2.0930 1800 0.0 -
2.1512 1850 0.0 -
2.2093 1900 0.0 -
2.2674 1950 0.0 -
2.3256 2000 0.0 -
2.3837 2050 0.0 -
2.4419 2100 0.0 -
2.5 2150 0.0 -
2.5581 2200 0.0 -
2.6163 2250 0.0 -
2.6744 2300 0.0 -
2.7326 2350 0.0 -
2.7907 2400 0.0 -
2.8488 2450 0.0 -
2.9070 2500 0.0 -
2.9651 2550 0.0 -
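
The Epoch and Step columns imply about 860 steps per epoch (for example, step 2150 falls at epoch 2.5). The arithmetic below is one way to account for that figure from the training sample counts and batch size. It assumes that pairs are drawn from all unordered combinations of the 160 training samples including each sample paired with itself, and that oversampling repeats the smaller group until it matches the larger one. These are assumptions about the pair sampler that this card does not state; `steps_per_epoch` is a hypothetical helper.

```python
from math import comb


def steps_per_epoch(class_counts, batch_size):
    """Estimate contrastive training steps per epoch under oversampling.

    Assumptions (not stated in this card): pairs are unordered, self-pairs
    are included, and the smaller of the positive/negative groups is
    repeated until it matches the larger.
    """
    n = sum(class_counts)
    # Same-label pairs, plus each sample paired with itself.
    positive = sum(comb(c, 2) for c in class_counts) + n
    # All unordered pairs (with self-pairs) minus the same-label ones.
    negative = comb(n, 2) + n - positive
    total_pairs = 2 * max(positive, negative)
    return -(-total_pairs // batch_size)  # ceiling division


steps = steps_per_epoch([100, 60], batch_size=16)
print(steps)             # 860
print(2150 / steps)      # 2.5
```

With 100 and 60 samples per label, this gives 6880 positive and 6000 negative pairs, 13760 pairs after oversampling, and 860 batches of 16, which matches the ratios in the table.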

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.2.1
  • Transformers: 4.44.2
  • PyTorch: 2.5.0+cu121
  • Datasets: 3.1.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 278M params (Safetensors, tensor type F32)