
SetFit with Alibaba-NLP/gte-base-en-v1.5

This is a SetFit model trained on the diwank/hn-upvote-data dataset that can be used for Text Classification. This SetFit model uses Alibaba-NLP/gte-base-en-v1.5 as the Sentence Transformer embedding model. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer (see the sketch after this list).
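
These two stages are visible in the structure of the loaded model. A minimal sketch of the split, assuming the SetFit 1.x model_body/model_head attribute names (the example title is made up):

from setfit import SetFitModel

model = SetFitModel.from_pretrained("diwank/hn-upvote-classifier")

# Stage 1: the fine-tuned Sentence Transformer maps text to embeddings.
embeddings = model.model_body.encode(["Show HN: a tiny static site generator"])

# Stage 2: the LogisticRegression head classifies those embeddings.
print(model.model_head.predict(embeddings))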

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: Alibaba-NLP/gte-base-en-v1.5
  • Classification head: LogisticRegression
  • Number of Classes: 2
  • Training Dataset: diwank/hn-upvote-data
  • Model Size: ~137M parameters (F32)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)

Model Labels

Label 0 examples:
  • 'The telltale words that could identify generative AI text'
  • 'The telltale words that could identify generative AI text'
  • 'The telltale words that could identify generative AI text'
Label 1 examples:
  • 'Dangerous Feelings\nSource: www.collaborativefund.com'
  • 'The Modos Paper Monitor\nSource: www.modos.tech'
  • 'What did Mary know? A thought experiment about consciousness (2013)\nSource: philosophynow.org'

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("diwank/hn-upvote-classifier")
# Run inference
preds = model("My Python code is a neural network")
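
Calling model(...) is shorthand for model.predict(...). For ranking titles rather than producing hard labels, the SetFit 1.x API also exposes predict_proba. A sketch continuing from the snippet above, with made-up titles (note that several label-1 training examples append a '\nSource: <domain>' line after the title, so matching that format may help):

# Per-class probabilities for a batch of titles (the titles are invented).
titles = [
    "Show HN: A SQLite-backed job queue\nSource: example.com",
    "The telltale words that could identify generative AI text",
]
for title, probs in zip(titles, model.predict_proba(titles)):
    print(f"{float(probs[1]):.3f}", title.splitlines()[0])  # score for label 1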

Training Details

Training Set Metrics

| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count   | 3   | 8.6577 | 18  |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 4577                  |
| 1     | 252                   |

The pronounced class imbalance (4577 vs. 252 examples) presumably motivates the undersampling strategy listed below.
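
A sketch of how these statistics could be recomputed from the dataset (the split name 'train' and the 'text'/'label' column names are assumptions, not taken from this card):

import numpy as np
from collections import Counter
from datasets import load_dataset

train = load_dataset("diwank/hn-upvote-data", split="train")  # split name assumed
word_counts = [len(t.split()) for t in train["text"]]         # column name assumed
print(min(word_counts), np.median(word_counts), max(word_counts))
print(Counter(train["label"]))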

Training Hyperparameters

  • batch_size: (320, 32)
  • num_epochs: (1, 16)
  • max_steps: -1
  • sampling_strategy: undersampling
  • body_learning_rate: (4e-05, 2e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: True
  • warmup_proportion: 0.05
  • l2_weight: 0.2
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
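
These bullets correspond one-to-one to fields of setfit.TrainingArguments, so the run can be approximately reconstructed. A sketch under stated assumptions: the dataset split and column names ('train'/'test', 'text'/'label') are guesses, and loading gte-base-en-v1.5 may additionally require trust_remote_code=True depending on your library versions:

from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments

dataset = load_dataset("diwank/hn-upvote-data")  # split/column names assumed
model = SetFitModel.from_pretrained("Alibaba-NLP/gte-base-en-v1.5")

args = TrainingArguments(
    batch_size=(320, 32),            # (embedding phase, classifier phase)
    num_epochs=(1, 16),
    sampling_strategy="undersampling",
    body_learning_rate=(4e-05, 2e-05),
    head_learning_rate=0.01,
    warmup_proportion=0.05,
    l2_weight=0.2,
    use_amp=True,
    end_to_end=True,
    seed=42,
    load_best_model_at_end=True,
    # loss defaults to CosineSimilarityLoss, matching the card.
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()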

Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0001 | 1    | 0.208         | -               |
| 0.0069 | 50   | 0.0121        | -               |
| 0.0139 | 100  | 0.002         | -               |
| 0.0208 | 150  | 0.0032        | -               |
| 0.0277 | 200  | 0.001         | -               |
| 0.0347 | 250  | 0.0006        | -               |
| 0.0416 | 300  | 0.0005        | -               |
| 0.0486 | 350  | 0.0004        | -               |
| 0.0555 | 400  | 0.0003        | -               |
| 0.0624 | 450  | 0.0002        | -               |
| 0.0694 | 500  | 0.0002        | -               |
| 0.0763 | 550  | 0.0002        | -               |
| 0.0832 | 600  | 0.0002        | -               |
| 0.0902 | 650  | 0.0001        | -               |
| 0.0971 | 700  | 0.0001        | -               |
| 0.1040 | 750  | 0.0001        | -               |
| 0.1110 | 800  | 0.0001        | -               |
| 0.1179 | 850  | 0.0001        | -               |
| 0.1248 | 900  | 0.0001        | -               |
| 0.1318 | 950  | 0.0001        | -               |
| 0.1387 | 1000 | 0.0001        | -               |
| 0.1457 | 1050 | 0.0001        | -               |
| 0.1526 | 1100 | 0.0001        | -               |
| 0.1595 | 1150 | 0.0001        | -               |
| 0.1665 | 1200 | 0.0001        | -               |
| 0.1734 | 1250 | 0.0001        | -               |
| 0.1803 | 1300 | 0.0001        | -               |
| 0.1873 | 1350 | 0.0001        | -               |
| 0.1942 | 1400 | 0.0001        | -               |
| 0.2011 | 1450 | 0.0001        | -               |
| 0.2081 | 1500 | 0.0001        | -               |
| 0.2150 | 1550 | 0.0001        | -               |
| 0.2219 | 1600 | 0.0           | -               |
| 0.2289 | 1650 | 0.0           | -               |
| 0.2358 | 1700 | 0.0           | -               |
| 0.2428 | 1750 | 0.0           | -               |
| 0.2497 | 1800 | 0.0001        | -               |
| 0.2566 | 1850 | 0.0           | -               |
| 0.2636 | 1900 | 0.0           | -               |
| 0.2705 | 1950 | 0.0           | -               |
| 0.2774 | 2000 | 0.0           | -               |
| 0.2844 | 2050 | 0.0           | -               |
| 0.2913 | 2100 | 0.0           | -               |
| 0.2982 | 2150 | 0.0           | -               |
| 0.3052 | 2200 | 0.0           | -               |
| 0.3121 | 2250 | 0.0           | -               |
| 0.3190 | 2300 | 0.0           | -               |
| 0.3260 | 2350 | 0.0           | -               |
| 0.3329 | 2400 | 0.0           | -               |
| 0.3399 | 2450 | 0.0           | -               |
| 0.3468 | 2500 | 0.0           | -               |
| 0.3537 | 2550 | 0.0           | -               |
| 0.3607 | 2600 | 0.0           | -               |
| 0.3676 | 2650 | 0.0           | -               |
| 0.3745 | 2700 | 0.0           | -               |
| 0.3815 | 2750 | 0.0           | -               |
| 0.3884 | 2800 | 0.0           | -               |
| 0.3953 | 2850 | 0.0           | -               |
| 0.4023 | 2900 | 0.0           | -               |
| 0.4092 | 2950 | 0.0           | -               |
| 0.4161 | 3000 | 0.0           | -               |
| 0.4231 | 3050 | 0.0           | -               |
| 0.4300 | 3100 | 0.0           | -               |
| 0.4370 | 3150 | 0.0           | -               |
| 0.4439 | 3200 | 0.0           | -               |
| 0.4508 | 3250 | 0.0           | -               |
| 0.4578 | 3300 | 0.0           | -               |
| 0.4647 | 3350 | 0.0           | -               |
| 0.4716 | 3400 | 0.0           | -               |
| 0.4786 | 3450 | 0.0           | -               |
| 0.4855 | 3500 | 0.0           | -               |
| 0.4924 | 3550 | 0.0           | -               |
| 0.4994 | 3600 | 0.0           | -               |
| 0.5063 | 3650 | 0.0           | -               |
| 0.5132 | 3700 | 0.0           | -               |
| 0.5202 | 3750 | 0.0           | -               |
| 0.5271 | 3800 | 0.0           | -               |
| 0.5341 | 3850 | 0.0           | -               |
| 0.5410 | 3900 | 0.0           | -               |
| 0.5479 | 3950 | 0.0           | -               |
| 0.5549 | 4000 | 0.0           | -               |
| 0.5618 | 4050 | 0.0           | -               |
| 0.5687 | 4100 | 0.0           | -               |
| 0.5757 | 4150 | 0.0           | -               |
| 0.5826 | 4200 | 0.0           | -               |
| 0.5895 | 4250 | 0.0           | -               |
| 0.5965 | 4300 | 0.0           | -               |
| 0.6034 | 4350 | 0.0           | -               |
| 0.6103 | 4400 | 0.0           | -               |
| 0.6173 | 4450 | 0.0           | -               |
| 0.6242 | 4500 | 0.0           | -               |
| 0.6312 | 4550 | 0.0           | -               |
| 0.6381 | 4600 | 0.0           | -               |
| 0.6450 | 4650 | 0.0           | -               |
| 0.6520 | 4700 | 0.0           | -               |
| 0.6589 | 4750 | 0.0           | -               |
| 0.6658 | 4800 | 0.0           | -               |
| 0.6728 | 4850 | 0.0           | -               |
| 0.6797 | 4900 | 0.0           | -               |
| 0.6866 | 4950 | 0.0           | -               |
| 0.6936 | 5000 | 0.0           | -               |
| 0.7005 | 5050 | 0.0           | -               |
| 0.7074 | 5100 | 0.0           | -               |
| 0.7144 | 5150 | 0.0           | -               |
| 0.7213 | 5200 | 0.0           | -               |
| 0.7283 | 5250 | 0.0           | -               |
| 0.7352 | 5300 | 0.0           | -               |
| 0.7421 | 5350 | 0.0           | -               |
| 0.7491 | 5400 | 0.0           | -               |
| 0.7560 | 5450 | 0.0           | -               |
| 0.7629 | 5500 | 0.0           | -               |
| 0.7699 | 5550 | 0.0           | -               |
| 0.7768 | 5600 | 0.0           | -               |
| 0.7837 | 5650 | 0.0           | -               |
| 0.7907 | 5700 | 0.0           | -               |
| 0.7976 | 5750 | 0.0           | -               |
| 0.8045 | 5800 | 0.0           | -               |
| 0.8115 | 5850 | 0.0           | -               |
| 0.8184 | 5900 | 0.0           | -               |
| 0.8254 | 5950 | 0.0           | -               |
| 0.8323 | 6000 | 0.0           | -               |
| 0.8392 | 6050 | 0.0           | -               |
| 0.8462 | 6100 | 0.0           | -               |
| 0.8531 | 6150 | 0.0           | -               |
| 0.8600 | 6200 | 0.0           | -               |
| 0.8670 | 6250 | 0.0           | -               |
| 0.8739 | 6300 | 0.0           | -               |
| 0.8808 | 6350 | 0.0           | -               |
| 0.8878 | 6400 | 0.0           | -               |
| 0.8947 | 6450 | 0.0           | -               |
| 0.9017 | 6500 | 0.0           | -               |
| 0.9086 | 6550 | 0.0           | -               |
| 0.9155 | 6600 | 0.0           | -               |
| 0.9225 | 6650 | 0.0           | -               |
| 0.9294 | 6700 | 0.0           | -               |
| 0.9363 | 6750 | 0.0           | -               |
| 0.9433 | 6800 | 0.0           | -               |
| 0.9502 | 6850 | 0.0           | -               |
| 0.9571 | 6900 | 0.0           | -               |
| 0.9641 | 6950 | 0.0           | -               |
| 0.9710 | 7000 | 0.0           | -               |
| 0.9779 | 7050 | 0.0           | -               |
| 0.9849 | 7100 | 0.0           | -               |
| 0.9918 | 7150 | 0.0           | -               |
| 0.9988 | 7200 | 0.0           | -               |

Framework Versions

  • Python: 3.10.14
  • SetFit: 1.0.3
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.3.1+cu121
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1
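
To approximate this environment, the versions above can be pinned at install time (a sketch; the CUDA 12.1 PyTorch build is typically installed from its matching package index):

pip install setfit==1.0.3 sentence-transformers==3.0.1 transformers==4.41.2 datasets==2.20.0 tokenizers==0.19.1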

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}