SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

Fine-tuning a Sentence Transformer with contrastive learning.
Training a classification head with features from the fine-tuned Sentence Transformer.
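The first step relies on sentence pairs built from the labelled examples. The following is a minimal sketch of that idea, not SetFit's actual sampler: it enumerates every pair instead of sampling, and the helper name make_pairs is hypothetical. Pairs with the same label get a target of 1.0 and pairs with different labels get 0.0, which is the kind of target a cosine-similarity loss trains against.

```python
import random
from itertools import combinations


def make_pairs(texts, labels, seed=42):
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper: SetFit samples pairs rather than enumerating all of them.
    """
    rng = random.Random(seed)
    pairs = []
    for i, j in combinations(range(len(texts)), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((texts[i], texts[j], target))
    rng.shuffle(pairs)  # mix positive and negative pairs
    return pairs


# Four example sentences taken from the Model Labels table.
texts = [
    "The senator is facing criticism for her stance on the recent bill.",
    "The upcoming election has sparked intense debates among the candidates.",
    "Climate change is causing a significant rise in sea levels.",
    "Recycling and composting are effective ways to reduce waste.",
]
labels = ["Politics", "Politics", "Environment", "Environment"]

pairs = make_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even a handful of labelled sentences yields many training pairs this way, which is why the approach suits few-shot settings.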

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 12

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
Politics	'The mayor announced a new initiative to improve public transportation.' 'The senator is facing criticism for her stance on the recent bill.' 'The upcoming election has sparked intense debates among the candidates.'
Health	'Regular exercise and a balanced diet are key to maintaining good health.' 'The World Health Organization has issued new guidelines on COVID-19.' 'A new study reveals the benefits of meditation for mental health.'
Finance	'The stock market saw a significant drop following the announcement.' 'Investing in real estate can be a profitable venture if done correctly.' "The company's profits have doubled since the launch of their new product."
Travel	'Visiting the Grand Canyon is a breathtaking experience.' 'The tourism industry has been severely impacted by the pandemic.' 'Backpacking through Europe is a popular choice for young travelers.'
Food	'The new restaurant in town offers a fusion of Italian and Japanese cuisine.' 'Drinking eight glasses of water a day is essential for staying hydrated.' 'Cooking classes are a fun way to learn new recipes and techniques.'
Education	'The school district is implementing a new curriculum for the upcoming year.' 'Online learning has become increasingly popular during the pandemic.' 'The university is offering scholarships for students in financial need.'
Environment	'Climate change is causing a significant rise in sea levels.' 'Recycling and composting are effective ways to reduce waste.' 'The Amazon rainforest is home to millions of unique species.'
Fashion	'The new fashion trend is all about sustainability and eco-friendly materials.' 'The annual Met Gala is a major event in the fashion world.' 'Vintage clothing has made a comeback in recent years.'
Science	"NASA's Mars Rover has made significant discoveries about the red planet." 'The Nobel Prize in Physics was awarded for breakthroughs in black hole research.' 'Genetic engineering is opening up new possibilities in medical treatment.'
Sports	'The NBA Finals are set to begin next week with the top two teams in the league.' 'Serena Williams continues to dominate the tennis world with her powerful serve.' 'The World Cup is the most prestigious tournament in international soccer.'
Technology	'Artificial intelligence is changing the way we live and work.' 'The latest iPhone has a number of exciting new features.' 'Cybersecurity is becoming increasingly important as more and more data moves online.'
Entertainment	'The new Marvel movie is breaking box office records.' 'The Grammy Awards are a celebration of the best music of the year.' 'The latest season of Game of Thrones had fans on the edge of their seats.'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("EmeraldMP/ANLP_kaggle")
# Run inference
preds = model("Climate change is causing a significant rise in sea levels.")

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	7	12.8073	24

Label	Training Sample Count
Education	23
Entertainment	23
Environment	23
Fashion	23
Finance	23
Food	23
Health	23
Politics	22
Science	23
Sports	23
Technology	23
Travel	23

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (10, 10)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 20
body_learning_rate: (2e-05, 2e-05)
head_learning_rate: 2e-05
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0015	1	0.2748	-
0.0727	50	0.2537	-
0.1453	100	0.1734	-
0.2180	150	0.1086	-
0.2907	200	0.062	-
0.3634	250	0.046	-
0.4360	300	0.017	-
0.5087	350	0.0104	-
0.5814	400	0.006	-
0.6541	450	0.0021	-
0.7267	500	0.0052	-
0.7994	550	0.0045	-
0.8721	600	0.0012	-
0.9448	650	0.0007	-
1.0174	700	0.0006	-
1.0901	750	0.0006	-
1.1628	800	0.0006	-
1.2355	850	0.0005	-
1.3081	900	0.0004	-
1.3808	950	0.0003	-
1.4535	1000	0.0004	-
1.5262	1050	0.0004	-
1.5988	1100	0.0004	-
1.6715	1150	0.0003	-
1.7442	1200	0.0002	-
1.8169	1250	0.0002	-
1.8895	1300	0.0005	-
1.9622	1350	0.0004	-
2.0349	1400	0.0002	-
2.1076	1450	0.0004	-
2.1802	1500	0.0002	-
2.2529	1550	0.0002	-
2.3256	1600	0.0004	-
2.3983	1650	0.0002	-
2.4709	1700	0.0002	-
2.5436	1750	0.0002	-
2.6163	1800	0.0002	-
2.6890	1850	0.0002	-
2.7616	1900	0.0003	-
2.8343	1950	0.0001	-
2.9070	2000	0.0002	-
2.9797	2050	0.0002	-
3.0523	2100	0.0003	-
3.125	2150	0.0002	-
3.1977	2200	0.0002	-
3.2703	2250	0.0001	-
3.3430	2300	0.0002	-
3.4157	2350	0.0002	-
3.4884	2400	0.0002	-
3.5610	2450	0.0001	-
3.6337	2500	0.0001	-
3.7064	2550	0.0001	-
3.7791	2600	0.0001	-
3.8517	2650	0.0001	-
3.9244	2700	0.0001	-
3.9971	2750	0.0001	-
4.0698	2800	0.0001	-
4.1424	2850	0.0001	-
4.2151	2900	0.0001	-
4.2878	2950	0.0001	-
4.3605	3000	0.0001	-
4.4331	3050	0.0001	-
4.5058	3100	0.0001	-
4.5785	3150	0.0001	-
4.6512	3200	0.0001	-
4.7238	3250	0.0001	-
4.7965	3300	0.0001	-
4.8692	3350	0.0001	-
4.9419	3400	0.0001	-
5.0145	3450	0.0001	-
5.0872	3500	0.0001	-
5.1599	3550	0.0001	-
5.2326	3600	0.0001	-
5.3052	3650	0.0001	-
5.3779	3700	0.0001	-
5.4506	3750	0.0001	-
5.5233	3800	0.0001	-
5.5959	3850	0.0001	-
5.6686	3900	0.0001	-
5.7413	3950	0.0001	-
5.8140	4000	0.0001	-
5.8866	4050	0.0001	-
5.9593	4100	0.0001	-
6.0320	4150	0.0001	-
6.1047	4200	0.0001	-
6.1773	4250	0.0001	-
6.25	4300	0.0001	-
6.3227	4350	0.0001	-
6.3953	4400	0.0001	-
6.4680	4450	0.0001	-
6.5407	4500	0.0001	-
6.6134	4550	0.0001	-
6.6860	4600	0.0001	-
6.7587	4650	0.0001	-
6.8314	4700	0.0001	-
6.9041	4750	0.0001	-
6.9767	4800	0.0	-
7.0494	4850	0.0001	-
7.1221	4900	0.0001	-
7.1948	4950	0.0001	-
7.2674	5000	0.0001	-
7.3401	5050	0.0001	-
7.4128	5100	0.0001	-
7.4855	5150	0.0001	-
7.5581	5200	0.0001	-
7.6308	5250	0.0001	-
7.7035	5300	0.0001	-
7.7762	5350	0.0001	-
7.8488	5400	0.0001	-
7.9215	5450	0.0001	-
7.9942	5500	0.0	-
8.0669	5550	0.0001	-
8.1395	5600	0.0001	-
8.2122	5650	0.0001	-
8.2849	5700	0.0	-
8.3576	5750	0.0001	-
8.4302	5800	0.0001	-
8.5029	5850	0.0001	-
8.5756	5900	0.0001	-
8.6483	5950	0.0001	-
8.7209	6000	0.0001	-
8.7936	6050	0.0001	-
8.8663	6100	0.0	-
8.9390	6150	0.0	-
9.0116	6200	0.0001	-
9.0843	6250	0.0001	-
9.1570	6300	0.0	-
9.2297	6350	0.0	-
9.3023	6400	0.0	-
9.375	6450	0.0001	-
9.4477	6500	0.0001	-
9.5203	6550	0.0001	-
9.5930	6600	0.0001	-
9.6657	6650	0.0001	-
9.7384	6700	0.0001	-
9.8110	6750	0.0001	-
9.8837	6800	0.0001	-
9.9564	6850	0.0	-
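The Epoch column can be related to the Step column with a little arithmetic. The sketch below assumes that the contrastive training set contains 2 × num_iterations × (number of training samples) pairs. That is an assumption about how pairs were generated, not something stated in this card, but it is consistent with the logged epoch fractions. The helper epoch_at is hypothetical.

```python
import math

# Per-label sample counts, copied from the Training Sample Count table.
sample_counts = {
    "Education": 23, "Entertainment": 23, "Environment": 23, "Fashion": 23,
    "Finance": 23, "Food": 23, "Health": 23, "Politics": 22,
    "Science": 23, "Sports": 23, "Technology": 23, "Travel": 23,
}

# Values from the Training Hyperparameters section.
num_iterations = 20
batch_size = 16
num_epochs = 10

num_samples = sum(sample_counts.values())

# Assumption: one positive and one negative pair per sample per iteration.
num_pairs = 2 * num_iterations * num_samples

steps_per_epoch = math.ceil(num_pairs / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step):
    """Fractional epoch reached after `step` optimisation steps (hypothetical helper)."""
    return round(step / steps_per_epoch, 4)


print(num_samples, num_pairs, steps_per_epoch, total_steps)
print(epoch_at(50), epoch_at(6850))
```

Under this assumption, 275 samples give 11,000 pairs and 688 steps per epoch, so step 50 falls at epoch 0.0727 and step 6850 at epoch 9.9564, matching the first and last logged rows at those steps.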

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.7.0
Transformers: 4.38.2
PyTorch: 2.2.1+cu121
Datasets: 2.18.0
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}