SetFit with mini1013/master_domain

This is a SetFit model for text classification. It uses mini1013/master_domain as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
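
The contrastive step works on sentence pairs rather than single examples: two texts with the same label get a target similarity of 1.0, and two texts with different labels get 0.0. The sketch below only illustrates that idea. The helper name `make_contrastive_pairs`, its sampling scheme, and the toy texts are assumptions made for this example and do not reproduce SetFit's own pair sampler.

```python
import random
from typing import List, Tuple


def make_contrastive_pairs(
    texts: List[str], labels: List[int], num_iterations: int = 1, seed: int = 42
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, target) triples for contrastive fine-tuning.

    Hypothetical helper: for every anchor text and every iteration, draw one
    same-label partner (target 1.0) and one different-label partner (target 0.0).
    """
    rng = random.Random(seed)
    pairs: List[Tuple[str, str, float]] = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates that share the anchor's label, excluding the anchor itself.
            same = [t for j, (t, l) in enumerate(zip(texts, labels)) if l == label and j != i]
            # Candidates with any other label.
            diff = [t for t, l in zip(texts, labels) if l != label]
            if same:
                pairs.append((text, rng.choice(same), 1.0))
            if diff:
                pairs.append((text, rng.choice(diff), 0.0))
    return pairs


# Toy placeholder data, not taken from the training set.
texts = ["hair pack A", "hair mask B", "styling cream C", "rinse D"]
labels = [1, 1, 0, 0]
pairs = make_contrastive_pairs(texts, labels, num_iterations=2)
print(len(pairs), "pairs generated")
```

Each anchor contributes one positive and one negative pair per iteration, so a small labelled set expands into a much larger set of training pairs.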

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: mini1013/master_domain
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 2 classes
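
To make the role of the classification head concrete, here is a minimal pure-Python sketch of binary logistic regression fitted on toy two-dimensional "embeddings". It is an illustration only: the real head operates on embeddings produced by the Sentence Transformer body, and the function names, learning rate, and toy vectors below are assumptions, not the model's actual implementation.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_head(
    X: List[List[float]], y: List[int], lr: float = 0.5, epochs: int = 500
) -> Tuple[List[float], float]:
    """Fit weights and bias with plain batch gradient descent on log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, target in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - target  # gradient of log loss w.r.t. the logit
            for k in range(n_features):
                grad_w[k] += err * x[k]
            grad_b += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w: List[float], b: float, x: List[float]) -> int:
    """Return class 1 when the predicted probability is at least 0.5."""
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


# Toy stand-ins for sentence embeddings of the two classes.
embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [1, 1, 0, 0]
w, b = fit_logistic_head(embeddings, labels)
predictions = [predict(w, b, e) for e in embeddings]
print(predictions)
```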

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1	'[웰라] 염색모전용 SP 컬러 세이브 마스크 400ml (#M)화장품/미용>헤어케어>헤어팩 LO > window_fashion_town > Naverstore > FashionTown > 뷰티 > CATEGORY > 헤어케어 > 트리트먼트/팩 > 헤어팩' '아모스 01 퓨어스마트 샴푸 팩 비듬케어 사춘기샴푸 퓨어 스마트 팩 300ml-비듬두피팩 (#M)홈>화장품/미용>헤어케어>샴푸 Naverstore > 화장품/미용 > 헤어케어 > 샴푸' '미쟝센 데미지 케어 로즈프로틴 헤어팩 150ml × 1개 (#M)쿠팡 홈>생활용품>헤어/바디/세안>트리트먼트/팩/앰플>헤어팩/헤어마스크 Coupang > 뷰티 > 헤어 > 트리트먼트/팩/앰플 > 헤어팩/헤어마스크'
0	'스무드 인퓨전 너리싱 스타일링 크림 250ml LotteOn > 뷰티 > 명품화장품 > 헤어케어 LotteOn > 뷰티 > 헤어케어 > 헤어에센스' '체리블라썸/아르간오일 트리트먼트 280ml x2개 02)모로코아르간 트리트먼트 2개 LotteOn > 뷰티 > 헤어케어 > 트리트먼트 LotteOn > 뷰티 > 헤어케어 > 트리트먼트' '[LG생활건강] 비욘드 프로페셔널 디펜스 트리트먼트 500ml LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 린스 LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 린스'

Evaluation

Metrics

Label	Accuracy
all	0.8787
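
Accuracy here is the fraction of evaluation examples whose predicted label equals the reference label. The sketch below shows the computation on made-up labels; the `accuracy` helper and the sample lists are illustrative assumptions and are unrelated to the evaluation set behind the reported score.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where the prediction matches the reference."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration: 6 of the 8 predictions are correct.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]
score = accuracy(y_true, y_pred)
print(f"accuracy = {score:.4f}")
```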

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_top_bt13_9_test_flat")
# Run inference
preds = model("미쟝센 퍼펙트 세럼 트리트먼트 330ml × 1개 (#M)쿠팡 홈>뷰티>헤어>트리트먼트/팩/앰플>일반 트리트먼트 Coupang > 뷰티 > 헤어 > 트리트먼트/팩/앰플 > 일반 트리트먼트")
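
For more than one input, it can be convenient to collect the predicted label together with the per-class probabilities. The helper below is a sketch under the assumption that the loaded model exposes `predict` and `predict_proba`, as `SetFitModel` does; the name `classify_batch` and the output layout are choices made for this example.

```python
from typing import Any, Dict, List


def _to_python(value: Any) -> Any:
    """Convert tensor or NumPy scalars to plain Python values when possible."""
    return value.item() if hasattr(value, "item") else value


def classify_batch(model: Any, texts: List[str]) -> List[Dict[str, Any]]:
    """Return the predicted label and class probabilities for each text.

    `model` is assumed to provide `predict(texts)` and `predict_proba(texts)`.
    """
    labels = model.predict(texts)
    probas = model.predict_proba(texts)
    results = []
    for text, label, proba in zip(texts, labels, probas):
        results.append(
            {
                "text": text,
                "label": _to_python(label),
                "probabilities": [float(p) for p in proba],
            }
        )
    return results
```

With the model loaded as shown above, `classify_batch(model, ["first product title", "second product title"])` would return one dictionary per input text.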

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	11	21.07	49
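
Word-count statistics like these can be reproduced for any list of texts. The sketch below assumes that a "word" is a whitespace-separated token, which is an assumption about how the counts were produced; the helper name and sample strings are illustrative.

```python
import statistics
from typing import Dict, List


def word_count_summary(texts: List[str]) -> Dict[str, float]:
    """Summarise whitespace-token counts across a list of texts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }


# Placeholder strings, not taken from the training set.
sample = ["hair pack 150ml", "argan oil treatment 280ml x2", "rinse"]
summary = word_count_summary(sample)
print(summary)
```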

Label	Training Sample Count
0	50
1	50

Training Hyperparameters

batch_size: (64, 64)
num_epochs: (30, 30)
max_steps: -1
sampling_strategy: oversampling
num_iterations: 100
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
l2_weight: 0.01
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
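
The values above can be collected into a configuration object for a comparable training run. The sketch below keeps the plain values in a dictionary and builds `setfit.TrainingArguments` lazily, so the dictionary can be inspected without the libraries installed. It assumes SetFit 1.x and Sentence Transformers are available when `build_training_args` is called; tuple values are understood here as (embedding fine-tuning phase, classifier training phase), and the names `HYPERPARAMS` and `build_training_args` are illustrative.

```python
from typing import Any, Dict

# Plain values copied from the hyperparameter list above. `loss` and
# `distance_metric` are passed separately because they are objects, not literals.
HYPERPARAMS: Dict[str, Any] = {
    "batch_size": (64, 64),
    "num_epochs": (30, 30),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 100,
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "l2_weight": 0.01,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_args():
    """Build TrainingArguments; assumes setfit 1.x and sentence-transformers."""
    from sentence_transformers.losses import (
        BatchHardTripletLossDistanceFunction,
        CosineSimilarityLoss,
    )
    from setfit import TrainingArguments

    return TrainingArguments(
        loss=CosineSimilarityLoss,
        distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
        **HYPERPARAMS,
    )
```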

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0064	1	0.4262	-
0.3185	50	0.4176	-
0.6369	100	0.314	-
0.9554	150	0.0953	-
1.2739	200	0.0302	-
1.5924	250	0.0123	-
1.9108	300	0.0005	-
2.2293	350	0.0002	-
2.5478	400	0.0001	-
2.8662	450	0.0001	-
3.1847	500	0.0001	-
3.5032	550	0.0	-
3.8217	600	0.0001	-
4.1401	650	0.0	-
4.4586	700	0.0	-
4.7771	750	0.0	-
5.0955	800	0.0001	-
5.4140	850	0.0001	-
5.7325	900	0.0	-
6.0510	950	0.0	-
6.3694	1000	0.0	-
6.6879	1050	0.0	-
7.0064	1100	0.0	-
7.3248	1150	0.0	-
7.6433	1200	0.0	-
7.9618	1250	0.0	-
8.2803	1300	0.0	-
8.5987	1350	0.0	-
8.9172	1400	0.0	-
9.2357	1450	0.0	-
9.5541	1500	0.0	-
9.8726	1550	0.0	-
10.1911	1600	0.0	-
10.5096	1650	0.0	-
10.8280	1700	0.0	-
11.1465	1750	0.0	-
11.4650	1800	0.0	-
11.7834	1850	0.0	-
12.1019	1900	0.0	-
12.4204	1950	0.0	-
12.7389	2000	0.0	-
13.0573	2050	0.0	-
13.3758	2100	0.0	-
13.6943	2150	0.0	-
14.0127	2200	0.0	-
14.3312	2250	0.0	-
14.6497	2300	0.0	-
14.9682	2350	0.0	-
15.2866	2400	0.0	-
15.6051	2450	0.0	-
15.9236	2500	0.0	-
16.2420	2550	0.0	-
16.5605	2600	0.0	-
16.8790	2650	0.0	-
17.1975	2700	0.0001	-
17.5159	2750	0.0001	-
17.8344	2800	0.0003	-
18.1529	2850	0.0	-
18.4713	2900	0.0	-
18.7898	2950	0.0	-
19.1083	3000	0.0	-
19.4268	3050	0.0	-
19.7452	3100	0.0001	-
20.0637	3150	0.0002	-
20.3822	3200	0.0	-
20.7006	3250	0.0	-
21.0191	3300	0.0	-
21.3376	3350	0.0	-
21.6561	3400	0.0	-
21.9745	3450	0.0	-
22.2930	3500	0.0	-
22.6115	3550	0.0	-
22.9299	3600	0.0	-
23.2484	3650	0.0	-
23.5669	3700	0.0	-
23.8854	3750	0.0	-
24.2038	3800	0.0	-
24.5223	3850	0.0	-
24.8408	3900	0.0	-
25.1592	3950	0.0	-
25.4777	4000	0.0	-
25.7962	4050	0.0	-
26.1146	4100	0.0	-
26.4331	4150	0.0	-
26.7516	4200	0.0	-
27.0701	4250	0.0	-
27.3885	4300	0.0	-
27.7070	4350	0.0	-
28.0255	4400	0.0	-
28.3439	4450	0.0	-
28.6624	4500	0.0	-
28.9809	4550	0.0	-
29.2994	4600	0.0	-
29.6178	4650	0.0	-
29.9363	4700	0.0	-
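
The Epoch and Step columns together imply how many optimizer steps make up one epoch, which in turn bounds the number of sentence pairs seen per epoch. The arithmetic below uses only the last table row, the batch size of 64, and the 30 epochs listed above; the variable names are illustrative, and the pair count is an upper bound because the final batch of an epoch may be partial.

```python
# Values read from the last row of the table and the hyperparameter list.
final_step = 4700
final_epoch = 29.9363
batch_size = 64
num_epochs = 30

# Steps per epoch implied by the logged epoch fraction.
steps_per_epoch = round(final_step / final_epoch)

# Upper bound on sentence pairs processed in one epoch.
max_pairs_per_epoch = steps_per_epoch * batch_size

# Total optimizer steps expected over the full run.
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, max_pairs_per_epoch, total_steps)
```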

Framework Versions

Python: 3.10.12
SetFit: 1.1.0
Sentence Transformers: 3.3.1
Transformers: 4.44.2
PyTorch: 2.2.0a0+81ea7a4
Datasets: 3.2.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
