SetFit Aspect Model with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for Aspect-Based Sentiment Analysis (ABSA). It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head. Within the ABSA pipeline, this model is responsible for filtering aspect span candidates.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
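
As a rough illustration of step 1, the sketch below builds the sentence pairs that contrastive fine-tuning consumes: a pair with matching labels gets a target similarity of 1.0, and a pair with different labels gets 0.0. The helper name `make_contrastive_pairs` and the toy examples are assumptions made for illustration; this is not SetFit's own sampler.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Turn (text, label) examples into (text_a, text_b, target) pairs.

    Hypothetical helper: target is 1.0 when both texts share a label
    and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy few-shot set (illustrative data, not the real training set).
few_shot = [
    ("staff:But the staff was so horrible to us.", "aspect"),
    ("food:The food was great.", "aspect"),
    ("factor:The only redeeming factor was the food.", "no aspect"),
    ("way:The venue is just way too busy.", "no aspect"),
]

pairs = make_contrastive_pairs(few_shot)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Four examples yield six pairs, which hints at why a small labeled set can still provide plenty of training signal for the embedding model. Step 2 then fits the classification head on embeddings produced by the fine-tuned model.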

This model was trained as part of a larger ABSA system, which works as follows:

  1. Use a spaCy model to select possible aspect span candidates.
  2. Use this SetFit model to filter these possible aspect span candidates.
  3. Use a second SetFit model to classify the polarity of the filtered aspect span candidates.
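
The three stages above can be sketched as a plain function that chains three callables. Everything here is a hypothetical stand-in: `run_absa`, the toy functions, and the `span`/`polarity` result keys are assumptions for illustration, and the toy logic replaces the spaCy model and both SetFit models.

```python
def run_absa(sentence, propose_spans, is_aspect, classify_polarity):
    """Three-stage ABSA flow; every callable is a hypothetical stand-in."""
    # Step 1: propose candidate spans (a spaCy model in the real system).
    candidates = propose_spans(sentence)
    # Step 2: keep only the candidates that the aspect filter accepts.
    aspects = [span for span in candidates if is_aspect(span, sentence)]
    # Step 3: assign a polarity label to each surviving span.
    return [
        {"span": span, "polarity": classify_polarity(span, sentence)}
        for span in aspects
    ]


# Toy stand-ins so the sketch runs without downloading any model.
def toy_spans(sentence):
    return ["food", "venue", "way"]


def toy_filter(span, sentence):
    return span in {"food", "venue"}


def toy_polarity(span, sentence):
    return "positive" if span == "food" else "negative"


result = run_absa(
    "The food was great, but the venue is just way too busy.",
    toy_spans,
    toy_filter,
    toy_polarity,
)
print(result)
```

Keeping the filter as a separate stage means the polarity classifier only ever sees spans that were already accepted as aspects.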

Model Details

Model Description

Model Sources

Model Labels

Label Examples
aspect
  • 'staff:But the staff was so horrible to us.'
  • "food:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."
  • "food:The food is uniformly exceptional, with a very capable kitchen which will proudly whip up whatever you feel like eating, whether it's on the menu or not."
no aspect
  • "factor:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."
  • "deficiencies:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."
  • "Teodora:To be completely fair, the only redeeming factor was the food, which was above average, but couldn't make up for all the other deficiencies of Teodora."
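
Each example above has the form `span:sentence`. A minimal sketch for separating the two parts, assuming the span itself never contains a colon (the helper name `split_example` is hypothetical):

```python
def split_example(example):
    """Split a 'span:sentence' string at the first colon only."""
    span, sep, sentence = example.partition(":")
    if not sep:
        raise ValueError("expected input of the form 'span:sentence'")
    return span, sentence


span, sentence = split_example("staff:But the staff was so horrible to us.")
print(span)      # candidate span
print(sentence)  # sentence the span was taken from
```

Splitting at the first colon only keeps any later colons inside the sentence intact.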

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import AbsaModel

# Download from the 🤗 Hub
model = AbsaModel.from_pretrained(
    "Davide1999/setfit-absa-model-aspect_10epochs",  # aspect filtering model (this model)
    "setfit-absa-polarity",  # polarity classification model
)
# Run inference
preds = model("The food was great, but the venue is just way too busy.")

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     4     17.9296   37

Label       Training Sample Count
no aspect   71
aspect      128

Training Hyperparameters

  • batch_size: (32, 32)
  • num_epochs: (10, 10)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0015 1 0.313 -
0.0740 50 0.2663 -
0.1479 100 0.2475 -
0.2219 150 0.2774 -
0.2959 200 0.1284 -
0.3698 250 0.0257 -
0.4438 300 0.003 -
0.5178 350 0.0014 -
0.5917 400 0.0009 -
0.6657 450 0.0004 -
0.7396 500 0.0003 -
0.8136 550 0.0003 -
0.8876 600 0.0003 -
0.9615 650 0.0003 -
1.0355 700 0.0002 -
1.1095 750 0.0134 -
1.1834 800 0.0001 -
1.2574 850 0.0001 -
1.3314 900 0.0001 -
1.4053 950 0.0001 -
1.4793 1000 0.0001 -
1.5533 1050 0.0001 -
1.6272 1100 0.0001 -
1.7012 1150 0.0001 -
1.7751 1200 0.0001 -
1.8491 1250 0.0001 -
1.9231 1300 0.0001 -
1.9970 1350 0.0001 -
2.0710 1400 0.0 -
2.1450 1450 0.0006 -
2.2189 1500 0.0001 -
2.2929 1550 0.0001 -
2.3669 1600 0.0 -
2.4408 1650 0.0001 -
2.5148 1700 0.0001 -
2.5888 1750 0.0 -
2.6627 1800 0.0001 -
2.7367 1850 0.0003 -
2.8107 1900 0.0 -
2.8846 1950 0.0 -
2.9586 2000 0.0 -
3.0325 2050 0.0001 -
3.1065 2100 0.0 -
3.1805 2150 0.0 -
3.2544 2200 0.0 -
3.3284 2250 0.0 -
3.4024 2300 0.0 -
3.4763 2350 0.0 -
3.5503 2400 0.0 -
3.6243 2450 0.0 -
3.6982 2500 0.0 -
3.7722 2550 0.0 -
3.8462 2600 0.0 -
3.9201 2650 0.0 -
3.9941 2700 0.0 -
4.0680 2750 0.0 -
4.1420 2800 0.0 -
4.2160 2850 0.0 -
4.2899 2900 0.0 -
4.3639 2950 0.0 -
4.4379 3000 0.0 -
4.5118 3050 0.0 -
4.5858 3100 0.0 -
4.6598 3150 0.0 -
4.7337 3200 0.0 -
4.8077 3250 0.0 -
4.8817 3300 0.0 -
4.9556 3350 0.0 -
5.0296 3400 0.0 -
5.1036 3450 0.0 -
5.1775 3500 0.0 -
5.2515 3550 0.0 -
5.3254 3600 0.0 -
5.3994 3650 0.0 -
5.4734 3700 0.0 -
5.5473 3750 0.0 -
5.6213 3800 0.0 -
5.6953 3850 0.0 -
5.7692 3900 0.0 -
5.8432 3950 0.0 -
5.9172 4000 0.0 -
5.9911 4050 0.0 -
6.0651 4100 0.0 -
6.1391 4150 0.0 -
6.2130 4200 0.0 -
6.2870 4250 0.0 -
6.3609 4300 0.0 -
6.4349 4350 0.0 -
6.5089 4400 0.0 -
6.5828 4450 0.0 -
6.6568 4500 0.0 -
6.7308 4550 0.0 -
6.8047 4600 0.0 -
6.8787 4650 0.0 -
6.9527 4700 0.0 -
7.0266 4750 0.0 -
7.1006 4800 0.0 -
7.1746 4850 0.0 -
7.2485 4900 0.0 -
7.3225 4950 0.0 -
7.3964 5000 0.0 -
7.4704 5050 0.0 -
7.5444 5100 0.0 -
7.6183 5150 0.0 -
7.6923 5200 0.0 -
7.7663 5250 0.0 -
7.8402 5300 0.0 -
7.9142 5350 0.0 -
7.9882 5400 0.0 -
8.0621 5450 0.0 -
8.1361 5500 0.0 -
8.2101 5550 0.0 -
8.2840 5600 0.0 -
8.3580 5650 0.0 -
8.4320 5700 0.0 -
8.5059 5750 0.0 -
8.5799 5800 0.0 -
8.6538 5850 0.0 -
8.7278 5900 0.0 -
8.8018 5950 0.0 -
8.8757 6000 0.0 -
8.9497 6050 0.0 -
9.0237 6100 0.0 -
9.0976 6150 0.0 -
9.1716 6200 0.0 -
9.2456 6250 0.0 -
9.3195 6300 0.0 -
9.3935 6350 0.0 -
9.4675 6400 0.0 -
9.5414 6450 0.0 -
9.6154 6500 0.0 -
9.6893 6550 0.0 -
9.7633 6600 0.0 -
9.8373 6650 0.0 -
9.9112 6700 0.0 -
9.9852 6750 0.0 -
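
The Epoch and Step columns can be cross-checked against the label counts and the batch size. The sketch below is one accounting that reproduces the logged ratio. It assumes that same-label pairs include each sentence paired with itself, and that oversampling repeats the smaller pair group until it matches the larger one. Both are assumptions about the pair sampler, not statements made in this card.

```python
from math import ceil, comb

n_aspect, n_no_aspect = 128, 71  # label counts from the training set metrics
batch_size = 32                  # first value of batch_size (embedding phase)

# Same-label pairs, counting a sentence paired with itself (assumption).
positive_pairs = comb(n_aspect, 2) + comb(n_no_aspect, 2) + n_aspect + n_no_aspect
# Cross-label pairs.
negative_pairs = n_aspect * n_no_aspect

# Oversampling (assumption): the smaller group is repeated to match the larger.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs)    # 10812 9088
print(pairs_per_epoch, steps_per_epoch)  # 21624 676
print(round(50 / steps_per_epoch, 4))    # 0.074
```

Under these assumptions one epoch is 676 steps, which agrees with the table: step 50 falls at epoch 0.0740 and step 6750 at epoch 9.9852.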

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.6.1
  • spaCy: 3.7.4
  • Transformers: 4.38.2
  • PyTorch: 2.2.1+cu121
  • Datasets: 2.18.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 109M params (Safetensors, tensor type F32)