SetFit Polarity Model with BAAI/bge-m3

This is a SetFit model that can be used for Aspect Based Sentiment Analysis (ABSA). This SetFit model uses BAAI/bge-m3 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification. In particular, this model is in charge of classifying aspect polarities.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
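The contrastive step relies on sentence pairs built from the few labelled examples. As a rough sketch (not SetFit's actual sampler), pairs that share a label get a similarity target of 1.0 and pairs with different labels get 0.0. The helper name `make_contrastive_pairs` and the toy sentences are illustrative assumptions.

```python
from itertools import combinations

def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label, else 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs

# Toy data, hypothetical and not taken from the training set.
texts = ["makanannya enak", "pelayanannya ramah", "harganya terlalu mahal"]
labels = ["positif", "positif", "negatif"]

pairs = make_contrastive_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

A loss such as CosineSimilarityLoss can then pull same-label embeddings together and push different-label embeddings apart before the classification head is fitted.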

This model was trained as one component of a larger ABSA pipeline, which runs in three steps:

Use a spaCy model to select possible aspect span candidates.
Use a SetFit model to filter these possible aspect span candidates.
Use this SetFit model to classify the filtered aspect span candidates.

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-m3
Classification head: a LogisticRegression instance
spaCy Model: id_core_news_trf
SetFitABSA Aspect Model: firqaaa/indo-setfit-absa-bert-base-restaurants-aspect
SetFitABSA Polarity Model: firqaaa/indo-setfit-absa-bert-base-restaurants-polarity
Maximum Sequence Length: 8192 tokens
Number of Classes: 4

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
netral	'sangat kecil sehingga reservasi adalah suatu keharusan:restoran ini sangat kecil sehingga reservasi adalah suatu keharusan.' 'di dekat seorang busboy dan mendesiskan rapido:di sebelah kanan saya, nyo rumah berdiri di dekat seorang busboy dan mendesiskan rapido, rapido ketika dia mencoba membersihkan dan mengatur ulang meja untuk enam orang.' 'dan mengatur ulang meja untuk enam orang:di sebelah kanan saya, nyo rumah berdiri di dekat seorang busboy dan mendesiskan rapido, rapido ketika dia mencoba membersihkan dan mengatur ulang meja untuk enam orang.'
negatif	'untuk enam orang nyonya rumah:di sebelah kanan saya, nyo rumah berdiri di dekat seorang busboy dan mendesiskan rapido, rapido ketika dia mencoba membersihkan dan mengatur ulang meja untuk enam orang nyonya rumah' 'setelah berurusan dengan pizza di bawah standar:setelah berurusan dengan pizza di bawah standar di seluruh lingkungan kensington - saya menemukan sedikit tonino.' 'mereka tidak mejikan bir, anda harus:perhatikan bahwa mereka tidak mejikan bir, anda harus membawa sendiri.'
positif	'saya tidak menyukai gnocchi.:saya tidak menyukai gnocchi.' 'dari makanan pembuka yang kami makan:dari makanan pembuka yang kami makan, dim sum, dan variasi makanan lain, tidak mungkin untuk mengkritik makanan tersebut.' 'kami makan, dim sum, dan variasi:dari makanan pembuka yang kami makan, dim sum, dan variasi makanan lain, tidak mungkin untuk mengkritik makanan tersebut.'
konflik	'makanan enak tapi jangan:makanan enak tapi jangan datang ke sini dengan perut kosong.' 'milik pihak rumah tagihan:namun, setiap perselisihan tentang ruu itu diimbangi oleh takaran minuman keras yang anda tuangkan sendiri yang merupakan milik pihak rumah tagihan' 'layanan meja bisa menjadi sedikit:layanan meja bisa menjadi sedikit lebih penuh perhatian tetapi sebagai seseorang yang juga bekerja di industri jasa, saya mengerti mereka sedang sibuk.'
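Each example in the table has the form `span with context:full sentence`. Judging from the rows above, the context appears to be three tokens on either side of the aspect span; that window size is inferred from the examples, not stated in this card. Below is a minimal sketch using whitespace tokens (the real pipeline tokenizes with spaCy, so punctuation is split differently). `build_polarity_input` is a hypothetical helper.

```python
def build_polarity_input(sentence, span_start, span_end, context=3):
    """Format one aspect span as '<span with context>:<sentence>'.

    span_start/span_end are token indices (end exclusive) into the
    whitespace-split sentence; context is the number of tokens kept
    on each side of the span.
    """
    tokens = sentence.split()
    left = max(0, span_start - context)
    right = min(len(tokens), span_end + context)
    return " ".join(tokens[left:right]) + ":" + sentence

sentence = "restoran ini sangat kecil sehingga reservasi adalah suatu keharusan."
# "reservasi" is token 5 in the whitespace split.
text = build_polarity_input(sentence, span_start=5, span_end=6)
print(text)
```

With whitespace tokens the trailing period stays attached to the last word, so the output differs slightly from the first netral example in the table.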

Evaluation

Metrics

Label	Accuracy
all	0.7898

Uses

Direct Use for Inference

First install the SetFit library. ABSA models also depend on spaCy, which the absa extra installs:

pip install "setfit[absa]"

Then you can load this model and run inference.

from setfit import AbsaModel

# Download from the 🤗 Hub
model = AbsaModel.from_pretrained(
    "firqaaa/setfit-indo-absa-restaurants-aspect",
    "firqaaa/setfit-indo-absa-restaurants-polarity",
)
# Run inference
preds = model("The food was great, but the venue is just way too busy.")
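The result can then be post-processed like any Python list. The sketch below assumes that, for a single input string, `preds` is a list of dictionaries with `span` and `polarity` keys; treat that shape as an assumption and check it against your installed SetFit version. The `example_preds` values are made up for illustration and are not real model output.

```python
from collections import Counter

def summarize_polarities(preds):
    """Count how many predicted aspect spans fall under each polarity label."""
    return Counter(item["polarity"] for item in preds)

# Hypothetical predictions in the assumed output shape (not real model output).
example_preds = [
    {"span": "food", "polarity": "positif"},
    {"span": "venue", "polarity": "negatif"},
]

summary = summarize_polarities(example_preds)
for label, count in summary.items():
    print(f"{label}: {count}")
```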

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	3	20.6594	62

Label	Training Sample Count
konflik	34
negatif	323
netral	258
positif	853
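The label counts are heavily skewed: positif makes up roughly 58% of the training samples and konflik only about 2%. This matters when reading a single accuracy figure. A quick calculation from the table (the counts come from this card; the code is only an illustration):

```python
# Per-label training sample counts, copied from the table above.
counts = {"konflik": 34, "negatif": 323, "netral": 258, "positif": 853}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Print labels from most to least frequent.
for label, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:8s} {counts[label]:4d}  {share:.1%}")

majority_label = max(counts, key=counts.get)
print(f"total={total}, majority={majority_label} ({shares[majority_label]:.1%})")
```

If the evaluation set has a similar label distribution (not stated in this card), always predicting positif would already score close to 58% accuracy.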

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (1, 1)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: True
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
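To get a feel for what a batch size of 16 combined with oversampling implies, the number of contrastive pairs can be estimated from the label counts under Training Set Metrics. This back-of-the-envelope sketch assumes that every unordered pair of distinct training examples is a candidate pair, and that oversampling draws the smaller of the positive and negative pair sets up to the size of the larger. Both are assumptions about the sampler, not statements from this card.

```python
from math import ceil, comb

# Label counts from "Training Set Metrics"; batch size from the hyperparameters.
counts = {"konflik": 34, "negatif": 323, "netral": 258, "positif": 853}
batch_size = 16

n = sum(counts.values())
all_pairs = comb(n, 2)                                      # unordered pairs of distinct examples
positive_pairs = sum(comb(c, 2) for c in counts.values())   # both examples share a label
negative_pairs = all_pairs - positive_pairs                 # examples have different labels

# Assumed oversampling rule: both sets are brought up to the larger size.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(positive_pairs, negative_pairs, pairs_per_epoch, steps_per_epoch)
```

Under these assumptions one epoch is about 78,461 steps, which is in line with the Training Results log, where step 3500 corresponds to epoch 0.0446 (3500 / 0.0446 ≈ 78,475).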

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0000	1	0.2345	-
0.0006	50	0.2337	-
0.0013	100	0.267	-
0.0019	150	0.2335	-
0.0025	200	0.2368	-
0.0032	250	0.2199	-
0.0038	300	0.2325	-
0.0045	350	0.2071	-
0.0051	400	0.2229	-
0.0057	450	0.1153	-
0.0064	500	0.1771	0.1846
0.0070	550	0.1612	-
0.0076	600	0.1487	-
0.0083	650	0.147	-
0.0089	700	0.1982	-
0.0096	750	0.1579	-
0.0102	800	0.1148	-
0.0108	850	0.1008	-
0.0115	900	0.2035	-
0.0121	950	0.1348	-
0.0127	1000	0.0974	0.182
0.0134	1050	0.121	-
0.0140	1100	0.1949	-
0.0147	1150	0.2424	-
0.0153	1200	0.0601	-
0.0159	1250	0.0968	-
0.0166	1300	0.0137	-
0.0172	1350	0.034	-
0.0178	1400	0.1217	-
0.0185	1450	0.0454	-
0.0191	1500	0.0397	0.2216
0.0198	1550	0.0226	-
0.0204	1600	0.0939	-
0.0210	1650	0.0537	-
0.0217	1700	0.0566	-
0.0223	1750	0.162	-
0.0229	1800	0.0347	-
0.0236	1850	0.103	-
0.0242	1900	0.0615	-
0.0249	1950	0.0589	-
0.0255	2000	0.1668	0.2132
0.0261	2050	0.1809	-
0.0268	2100	0.0579	-
0.0274	2150	0.088	-
0.0280	2200	0.1047	-
0.0287	2250	0.1255	-
0.0293	2300	0.0312	-
0.0300	2350	0.0097	-
0.0306	2400	0.0973	-
0.0312	2450	0.0066	-
0.0319	2500	0.0589	0.2591
0.0325	2550	0.0529	-
0.0331	2600	0.0169	-
0.0338	2650	0.0455	-
0.0344	2700	0.0609	-
0.0350	2750	0.1151	-
0.0357	2800	0.0031	-
0.0363	2850	0.0546	-
0.0370	2900	0.0051	-
0.0376	2950	0.0679	-
0.0382	3000	0.0046	0.2646
0.0389	3050	0.011	-
0.0395	3100	0.0701	-
0.0401	3150	0.0011	-
0.0408	3200	0.011	-
0.0414	3250	0.0026	-
0.0421	3300	0.0027	-
0.0427	3350	0.0012	-
0.0433	3400	0.0454	-
0.0440	3450	0.0011	-
0.0446	3500	0.0012	0.2602

The saved checkpoint is marked in bold in the rendered table. The lowest validation loss logged above is 0.182, at step 1000.

Framework Versions

Python: 3.10.13
SetFit: 1.0.3
Sentence Transformers: 2.2.2
spaCy: 3.7.4
Transformers: 4.36.2
PyTorch: 2.1.2+cu121
Datasets: 2.16.1
Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
