metadata
base_model: mini1013/master_domain
library_name: setfit
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: >-
      헤어샵 전용 바이오메드 엘피피 트리트먼트 LPP 실크 트리트먼트1000ml 사은품 증정  (#M)쿠팡
      홈>뷰티>헤어>트리트먼트/팩>일반 트리트먼트 Coupang > 뷰티 > 헤어 > 트리트먼트/팩 > 일반 트리트먼트
  - text: >-
      미쟝센 퍼펙트 세럼 트리트먼트 330ml × 1개 (#M)쿠팡 홈>뷰티>헤어>트리트먼트/팩/앰플>일반 트리트먼트 Coupang >
      뷰티 > 헤어 > 트리트먼트/팩/앰플 > 일반 트리트먼트
  - text: >-
      한소희Pick 로레알파리 토탈리페어5 트리트먼트 헤어팩 400ml 50ml 헤어팩280ml LotteOn > 뷰티 > 헤어/바디 >
      헤어케어 > 트리트먼트/헤어팩 LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 트리트먼트/헤어팩
  - text: >-
      밀크바오밥 오리지널 샴푸 화이트솝 1L(옵션선택1) 11 트리트먼트 화이트솝 1000ml (#M)헤어케어>샴푸>샴푸바 AD >
      traverse > 11st > 뷰티 > 헤어케어 > 샴푸 > 샴푸바
  - text: >-
      로레알 토탈리페어5 헤어팩 280ml + 170ml  (#M)쿠팡 홈>생활용품>헤어/바디/세안>트리트먼트/팩/앰플>헤어팩/헤어마스크
      Coupang > 뷰티 > 헤어 > 트리트먼트/팩/앰플 > 헤어팩/헤어마스크
inference: true
model-index:
  - name: SetFit with mini1013/master_domain
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.8786919831223629
            name: Accuracy

SetFit with mini1013/master_domain

This is a SetFit model for text classification. It uses mini1013/master_domain as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
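The first step above can be illustrated with a minimal sketch of how labeled texts turn into contrastive training pairs. This is a simplification under stated assumptions: `make_contrastive_pairs` is a hypothetical helper, not part of the SetFit API, and it enumerates every pair exhaustively, whereas SetFit samples pairs according to its `sampling_strategy`. Pairs with the same label get a target similarity of 1.0 and pairs with different labels get 0.0, which is the kind of target a cosine-similarity loss trains toward.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def make_contrastive_pairs(
    texts: Sequence[str], labels: Sequence[int]
) -> List[Tuple[str, str, float]]:
    """Hypothetical helper: build (text_a, text_b, target_similarity) triples.

    Same-label pairs get target 1.0, different-label pairs get target 0.0.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy inputs for illustration only; these are not the model's training data.
toy_texts = ["hair pack A", "hair pack B", "treatment C", "treatment D"]
toy_labels = [1, 1, 0, 0]
toy_pairs = make_contrastive_pairs(toy_texts, toy_labels)
```

Four samples yield six pairs here, two of them positive. The fine-tuned embeddings then serve as input features for the classification head in the second step.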

Model Details

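Model Description

  • Model Type: SetFit
  • Sentence Transformer body: mini1013/master_domain
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2

Model Sources

  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)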

Model Labels

Label 1 examples:
  • '[웰라] 염색모전용 SP 컬러 세이브 마스크 400ml (#M)화장품/미용>헤어케어>헤어팩 LO > window_fashion_town > Naverstore > FashionTown > 뷰티 > CATEGORY > 헤어케어 > 트리트먼트/팩 > 헤어팩'
  • '아모스 01 퓨어스마트 샴푸 팩 비듬케어 사춘기샴푸 퓨어 스마트 팩 300ml-비듬두피팩 (#M)홈>화장품/미용>헤어케어>샴푸 Naverstore > 화장품/미용 > 헤어케어 > 샴푸'
  • '미쟝센 데미지 케어 로즈프로틴 헤어팩 150ml × 1개 (#M)쿠팡 홈>생활용품>헤어/바디/세안>트리트먼트/팩/앰플>헤어팩/헤어마스크 Coupang > 뷰티 > 헤어 > 트리트먼트/팩/앰플 > 헤어팩/헤어마스크'

Label 0 examples:
  • '스무드 인퓨전 너리싱 스타일링 크림 250ml LotteOn > 뷰티 > 명품화장품 > 헤어케어 LotteOn > 뷰티 > 헤어케어 > 헤어에센스'
  • '체리블라썸/아르간오일 트리트먼트 280ml x2개 02)모로코아르간 트리트먼트 2개 LotteOn > 뷰티 > 헤어케어 > 트리트먼트 LotteOn > 뷰티 > 헤어케어 > 트리트먼트'
  • '[LG생활건강] 비욘드 프로페셔널 디펜스 트리트먼트 500ml LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 린스 LotteOn > 뷰티 > 헤어/바디 > 헤어케어 > 린스'

Evaluation

Metrics

Label Accuracy
all 0.8787
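For reference, accuracy is the fraction of test examples whose predicted label matches the true label. The sketch below shows that computation on toy values; `accuracy` is a hypothetical helper and the lists are made up for illustration, not taken from this model's test split.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels for illustration only.
toy_true = [1, 0, 1, 1]
toy_pred = [1, 0, 0, 1]
toy_accuracy = accuracy(toy_true, toy_pred)  # 3 of 4 predictions match
```

The 0.8787 reported above is the rounded form of the 0.8786919831223629 recorded in the card metadata.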

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("mini1013/master_cate_top_bt13_9_test_flat")
# Run inference
preds = model("미쟝센 퍼펙트 세럼 트리트먼트 330ml × 1개 (#M)쿠팡 홈>뷰티>헤어>트리트먼트/팩/앰플>일반 트리트먼트 Coupang > 뷰티 > 헤어 > 트리트먼트/팩/앰플 > 일반 트리트먼트")
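When classifying many product titles, it can be convenient to pass them in one batch and keep each input next to its prediction. The following is a small sketch: `classify_titles` is a hypothetical wrapper, not part of SetFit, and it assumes the model is callable on a list of strings and returns one integer-like label per input (this card lists labels 0 and 1).

```python
from typing import Any, Callable, List, Sequence, Tuple


def classify_titles(
    model: Callable[[List[str]], Sequence[Any]],
    titles: Sequence[str],
) -> List[Tuple[str, int]]:
    """Run one batched prediction and pair each title with its predicted label."""
    # Drop empty or whitespace-only inputs before calling the model.
    batch = [t.strip() for t in titles if t and t.strip()]
    if not batch:
        return []
    preds = model(batch)  # assumed: one prediction per input string
    return [(title, int(pred)) for title, pred in zip(batch, preds)]


# Usage with the real model (requires `pip install setfit` and network access):
#   from setfit import SetFitModel
#   model = SetFitModel.from_pretrained("mini1013/master_cate_top_bt13_9_test_flat")
#   results = classify_titles(model, ["<product title and category path>"])
```

Keeping the model as a plain callable argument makes the wrapper easy to exercise with a stub before downloading the real weights.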

Training Details

Training Set Metrics

Training set Min Median Max
Word count 11 21.07 49

Label Training Sample Count
0 50
1 50

Training Hyperparameters

  • batch_size: (64, 64)
  • num_epochs: (30, 30)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 100
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0064 1 0.4262 -
0.3185 50 0.4176 -
0.6369 100 0.314 -
0.9554 150 0.0953 -
1.2739 200 0.0302 -
1.5924 250 0.0123 -
1.9108 300 0.0005 -
2.2293 350 0.0002 -
2.5478 400 0.0001 -
2.8662 450 0.0001 -
3.1847 500 0.0001 -
3.5032 550 0.0 -
3.8217 600 0.0001 -
4.1401 650 0.0 -
4.4586 700 0.0 -
4.7771 750 0.0 -
5.0955 800 0.0001 -
5.4140 850 0.0001 -
5.7325 900 0.0 -
6.0510 950 0.0 -
6.3694 1000 0.0 -
6.6879 1050 0.0 -
7.0064 1100 0.0 -
7.3248 1150 0.0 -
7.6433 1200 0.0 -
7.9618 1250 0.0 -
8.2803 1300 0.0 -
8.5987 1350 0.0 -
8.9172 1400 0.0 -
9.2357 1450 0.0 -
9.5541 1500 0.0 -
9.8726 1550 0.0 -
10.1911 1600 0.0 -
10.5096 1650 0.0 -
10.8280 1700 0.0 -
11.1465 1750 0.0 -
11.4650 1800 0.0 -
11.7834 1850 0.0 -
12.1019 1900 0.0 -
12.4204 1950 0.0 -
12.7389 2000 0.0 -
13.0573 2050 0.0 -
13.3758 2100 0.0 -
13.6943 2150 0.0 -
14.0127 2200 0.0 -
14.3312 2250 0.0 -
14.6497 2300 0.0 -
14.9682 2350 0.0 -
15.2866 2400 0.0 -
15.6051 2450 0.0 -
15.9236 2500 0.0 -
16.2420 2550 0.0 -
16.5605 2600 0.0 -
16.8790 2650 0.0 -
17.1975 2700 0.0001 -
17.5159 2750 0.0001 -
17.8344 2800 0.0003 -
18.1529 2850 0.0 -
18.4713 2900 0.0 -
18.7898 2950 0.0 -
19.1083 3000 0.0 -
19.4268 3050 0.0 -
19.7452 3100 0.0001 -
20.0637 3150 0.0002 -
20.3822 3200 0.0 -
20.7006 3250 0.0 -
21.0191 3300 0.0 -
21.3376 3350 0.0 -
21.6561 3400 0.0 -
21.9745 3450 0.0 -
22.2930 3500 0.0 -
22.6115 3550 0.0 -
22.9299 3600 0.0 -
23.2484 3650 0.0 -
23.5669 3700 0.0 -
23.8854 3750 0.0 -
24.2038 3800 0.0 -
24.5223 3850 0.0 -
24.8408 3900 0.0 -
25.1592 3950 0.0 -
25.4777 4000 0.0 -
25.7962 4050 0.0 -
26.1146 4100 0.0 -
26.4331 4150 0.0 -
26.7516 4200 0.0 -
27.0701 4250 0.0 -
27.3885 4300 0.0 -
27.7070 4350 0.0 -
28.0255 4400 0.0 -
28.3439 4450 0.0 -
28.6624 4500 0.0 -
28.9809 4550 0.0 -
29.2994 4600 0.0 -
29.6178 4650 0.0 -
29.9363 4700 0.0 -
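The Epoch and Step columns make it possible to back out how many optimizer steps one epoch took. The sketch below does that arithmetic from a single logged row and the listed hyperparameters. The last two lines rest on an assumption that is not stated in this card: that each epoch draws about 100 training samples × 100 iterations = 10,000 pairs. That figure happens to give the same step count, but it is an inference, not a documented fact.

```python
import math

# Values read from the training log and hyperparameter list above.
logged_step, logged_epoch = 50, 0.3185
num_epochs = 30
batch_size = 64

# Steps per epoch implied by the logged row (50 / 0.3185 is about 156.99).
steps_per_epoch = round(logged_step / logged_epoch)
total_steps = steps_per_epoch * num_epochs
max_pairs_per_epoch = steps_per_epoch * batch_size  # upper bound; last batch may be partial

# Assumption (not stated in the card): 100 samples x 100 iterations pairs per epoch.
assumed_pairs = 100 * 100
assumed_steps = math.ceil(assumed_pairs / batch_size)

print(steps_per_epoch, total_steps, max_pairs_per_epoch, assumed_steps)
```

With 157 steps per epoch, 30 epochs come to 4,710 steps, which agrees with the final logged row (step 4700 at epoch 29.9363). The training loss is already near zero by about epoch 2, so most of those steps run on a converged embedding body.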

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.1.0
  • Sentence Transformers: 3.3.1
  • Transformers: 4.44.2
  • PyTorch: 2.2.0a0+81ea7a4
  • Datasets: 3.2.0
  • Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}