
SetFit

This is a SetFit model that can be used for Text Classification. A LogisticRegression instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
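Step 1 trains on sentence pairs rather than single sentences. As a rough illustration (not the SetFit library's own code), the sketch below builds such pairs from a few of the label examples listed in this card: two texts with the same label get a target similarity of 1.0, and two texts with different labels get 0.0. The helper name make_contrastive_pairs is hypothetical.

```python
from itertools import combinations


def make_contrastive_pairs(samples):
    """Build (text_a, text_b, target) triples from (text, label) samples.

    The target is 1.0 when both texts share a label and 0.0 otherwise.
    Each unordered pair of distinct samples appears exactly once.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Four label examples copied from this card (two labels, two texts each).
samples = [
    ("prlv sepa grdf", "Housing / utilities & bills"),
    ("prlv sepa total direct energie elec", "Housing / utilities & bills"),
    ("prlv sepa eurostar", "Leisure & Entertainment / travel"),
    ("achat carte hertz location carte usa usd commission",
     "Leisure & Entertainment / travel"),
]

pairs = make_contrastive_pairs(samples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

With few samples per class, negative pairs quickly outnumber positive ones, which is why a sampling strategy for balancing the two groups matters during fine-tuning.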

Model Details

Model Description

  • Model Type: SetFit
  • Classification head: a LogisticRegression instance
  • Maximum Sequence Length: 128 tokens
  • Number of Classes: 44 classes

Model Sources

Model Labels

Label Examples
Shopping / electronics & multimedia
  • 'achat dji technology carte chn'
  • 'facture carte samsung paris opera carte'
Other / kids
  • 'virement sortant cadeau anniversaire neveu'
  • 'paiement carte lunapark family fun carte'
Bank services / other
  • 'paiement frais demande rib iban supplémentaires carte'
  • 'frais changement de pin carte'
Housing / rent
  • 'paiement loyer rue des oliviers carte'
  • 'sepa regl loyer resid les ormeaux carte'
Transportation / other
  • 'parking aeroport charles de gaulle carte'
  • 'frais douane import vehicule usa carte usd commission'
Bank services / transfers
  • 'transfer location vacances famille roux carte'
  • 'virement sepa entrant de loyer mars carte'
Investment / retirement & savings
  • 'alimentation plan epargne logement carte'
  • 'allocation retraite complémentaire carte'
Other / taxes
  • 'contribution economique territoriale siret frcte'
  • 'taxe apprentissage siret frapp'
Healthy & Beauty / other
  • 'adhésion club randonnée plein air'
  • 'achat en ligne produits aromatherapie naturesence carte'
Investment / securities
  • 'investissement silver etf carte silver oz'
  • 'transaction actions netflix carte usd'
Housing / other
  • 'virement recu du remboursement depot de garantie'
  • 'prlv sepa du alarmes securitas direct'
Housing / house loan
  • 'solde emprunt habitat fortuneo pret'
  • 'prelevement sepa pret habitation hsbc france'
Housing / utilities & bills
  • 'prlv sepa grdf'
  • 'prlv sepa total direct energie elec'
Bank services / general fees
  • 'frais opposition cheque perdu'
  • 'frais de gestion portefeuille titres'
Leisure & Entertainment / culture & events
  • 'prlv sepa cinema cgr lille'
  • 'achat carte festival rock en seine carte'
Transportation / taxi & carpool
  • 'prlv sepa blablacar carte'
  • 'facture carte du kakao taxi seoul carte kor krw commission'
Shopping / other
  • 'achat coffrets cadeaux pandore carte'
  • 'facture carte du magasin l unique montpellier carte'
Recurrent Payments / loans
  • 'retrait auto emma pret familial emmaprt carte'
  • 'paiement échéance axa pret professionnel carte'
Healthy & Beauty / doctor fees
  • 'facture carte du dr pierre neurologue carte'
  • 'facture carte du dr marchand orthopediste carte'
Bank services / withdrawal
  • 'retrait dab banque express toulouse carte fr'
  • 'retrait dab ecobanque lyon carte fr'
Other / other
  • 'facture carte du cinema rexy paris carte'
  • 'don association sos villages enfants'
Healthy & Beauty / pharmacy
  • 'prlv sepa pharmacie azureech'
  • 'debit carte pharmacie grand ciel carte'
Transportation / fuel
  • 'facture carte du total energies paris carte'
  • 'prlv sepa du q bruxelles carte bel'
Shopping / sporting goods
  • 'pmt carte fitnessboutique lyon carte'
  • 'paiement carte go sport montpellier carte'
Food & Drinks / groceries
  • 'facture carte du magasin asiatique lee carte'
  • 'debit charcuterie gourmets carte'
Other / pets
  • 'prlv sepa soins veterinaires urgences'
  • 'achat académie dressage canin carte'
Investment / real estate
  • 'virement sortant investissement immobilier crowdfunding carte'
  • 'virement recu vente local commercial nice carte'
Shopping / clothing
  • 'achat decathlon carte'
  • 'achat carte nike store carte usa usd commission'
Shopping / housing equipment
  • 'facture carte du conforama montpellier carte'
  • 'paiement par carte ambiances matieres marseille carte'
Transportation / maitenance
  • 'facture du vitres teintees luxe bordeaux carte'
  • 'debit du garage turbo moteurs strasbourg carte remise a neuf'
Recurrent Payments / other
  • 'abonnement annuel magazine interstellar transaction date'
  • 'cotisation annuelle club échecs rois et pions date'
Recurrent Payments / insurance
  • 'prelevement sepa assurance multirisque pro mma'
  • 'prélèvement mensuel assurance collective cnp'
Healthy & Beauty / veterinary
  • 'deworming petcare lyon carte'
  • 'prlv sepa hospital vet duval limoges'
Transportation / public transportation
  • 'achat titres v ville de lille carte'
  • 'abonnement tram strasbourg cts carte'
Healthy & Beauty / beauty & self-care
  • 'prlv sepa abonnement biotyfull box'
  • 'facture carte du mac cosmetics nice carte'
Leisure & Entertainment / other
  • 'paiement en ligne du amazon prime video carte usa'
  • 'facture carte du spotify premium carte usa'
Food & Drinks / eating out
  • 'facture carte du cafe de flore carte'
  • 'facture carte du mcdonald s carte usa usd commission'
Housing / services & maintenance
  • 'prlv sepa electricite generale flash'
  • 'virement recu soldes tuyauterie moderne'
Leisure & Entertainment / travel
  • 'prlv sepa eurostar'
  • 'achat carte hertz location carte usa usd commission'
Leisure & Entertainment / sports & hobbies
  • 'paiement en ligne du adidas fr carte'
  • 'facture carte du culture velo lyon carte'
Investment / other
  • 'souscription part sociale coop biolocal'
  • 'participation crowdfunding waterclean projet'
Transportation / car loan & leasing
  • 'virement mensualite bmw x debmwx'
  • 'prlv sepa dacia lodgy crdit auto'
Recurrent Payments / subscription
  • 'prlv sepa microsoft office svc carte'
  • 'facture carte du adobe creative cloud photo carte'
Food & Drinks / other
  • 'facture carte du café de flore carte'
  • 'debit carte caviste le grand cru carte'

Evaluation

Metrics

Label Accuracy
all 0.25
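To put the accuracy figure in context, the sketch below shows how accuracy is computed and compares the reported value with the expected accuracy of uniform random guessing over 44 classes. The accuracy helper and the toy label lists are hypothetical and are not taken from the model's evaluation set.

```python
def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical toy data: one correct prediction out of four.
expected = ["Housing / rent", "Other / pets", "Other / taxes", "Other / kids"]
predicted = ["Housing / rent", "Other / other", "Other / other", "Other / other"]
toy_accuracy = accuracy(predicted, expected)

# Values from this card: 44 classes, reported accuracy of 0.25.
num_classes = 44
reported_accuracy = 0.25
chance_accuracy = 1 / num_classes               # uniform random guessing
ratio_to_chance = reported_accuracy * num_classes

print(f"toy accuracy: {toy_accuracy:.2f}")
print(f"chance accuracy: {chance_accuracy:.4f}")
print(f"reported / chance: {ratio_to_chance:.1f}")
```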

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("HEN10/setfit-particular-transaction-solon-embeddings-labels-large-kaggle-automatisation-v1")
# Run inference
preds = model("achat académie dressage canin carte")
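Every label in this card has the form "Category / subcategory". Assuming predictions come back as label strings in that form (this depends on how the labels were stored with the model), a small helper can separate the two levels for downstream reporting. The function name split_label is hypothetical and is not part of SetFit.

```python
def split_label(label: str) -> tuple[str, str]:
    """Split 'Category / subcategory' into its two parts.

    Only the first ' / ' is treated as the separator. If the separator
    is missing, the subcategory is returned as an empty string.
    """
    category, _, subcategory = label.partition(" / ")
    return category.strip(), subcategory.strip()


# Label strings copied from this card.
print(split_label("Other / pets"))
print(split_label("Leisure & Entertainment / culture & events"))
```

For a single prediction this could be applied as split_label(str(preds)), under the same assumption about the prediction format.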

Training Details

Training Set Metrics

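Training set Min Median Max
Word count 3 6.0455 10
(The value in the Median column is not a multiple of 0.5, so it appears to be the mean word count over the 88 training samples, 532 / 88 ≈ 6.0455.)

Label Training Sample Count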
Housing / rent 2
Housing / house loan 2
Housing / utilities & bills 2
Housing / services & maintenance 2
Housing / other 2
Food & Drinks / groceries 2
Food & Drinks / eating out 2
Food & Drinks / other 2
Leisure & Entertainment / sports & hobbies 2
Leisure & Entertainment / culture & events 2
Leisure & Entertainment / travel 2
Leisure & Entertainment / other 2
Transportation / car loan & leasing 2
Transportation / fuel 2
Transportation / public transportation 2
Transportation / taxi & carpool 2
Transportation / maitenance 2
Transportation / other 2
Recurrent Payments / loans 2
Recurrent Payments / insurance 2
Recurrent Payments / subscription 2
Recurrent Payments / other 2
Investment / securities 2
Investment / retirement & savings 2
Investment / real estate 2
Investment / other 2
Shopping / clothing 2
Shopping / electronics & multimedia 2
Shopping / sporting goods 2
Shopping / housing equipment 2
Shopping / other 2
Healthy & Beauty / doctor fees 2
Healthy & Beauty / pharmacy 2
Healthy & Beauty / beauty & self-care 2
Healthy & Beauty / veterinary 2
Healthy & Beauty / other 2
Bank services / transfers 2
Bank services / withdrawal 2
Bank services / general fees 2
Bank services / other 2
Other / taxes 2
Other / kids 2
Other / pets 2
Other / other 2

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: True
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 6
  • eval_max_steps: -1
  • load_best_model_at_end: False

Training Results

Epoch Step Training Loss Validation Loss
0.0021 1 0.1662 -
0.1057 50 0.1483 -
0.2114 100 0.0681 -
0.3171 150 0.0298 -
0.4228 200 0.0245 -
0.5285 250 0.0117 -
0.6342 300 0.032 -
0.7400 350 0.0112 -
0.8457 400 0.0072 -
0.9514 450 0.0176 -
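The epoch fractions in this table can be checked against the training set size. The sketch below is a back-of-the-envelope calculation, assuming that contrastive pairs are unordered pairs of distinct training samples and that the oversampling strategy draws as many positive pairs as there are negative pairs. Both assumptions are inferred, not stated in this card, but they reproduce the logged values.

```python
from math import comb

# Values from this card.
num_classes = 44
samples_per_class = 2
batch_size = 16

num_samples = num_classes * samples_per_class               # 88 training texts
all_pairs = comb(num_samples, 2)                            # unordered pairs
positive_pairs = num_classes * comb(samples_per_class, 2)   # same-label pairs
negative_pairs = all_pairs - positive_pairs                 # different-label pairs

# Assumption: oversampling brings the smaller group up to the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = pairs_per_epoch // batch_size

print(positive_pairs, negative_pairs, pairs_per_epoch, steps_per_epoch)

# Compare with the logged (epoch, step) values from the table.
for step in (1, 50, 350, 450):
    print(step, round(step / steps_per_epoch, 4))
```

Under these assumptions one epoch is 473 steps, so step 450 falls at epoch 0.9514, matching the last logged row.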

Framework Versions

  • Python: 3.10.13
  • SetFit: 1.0.3
  • Sentence Transformers: 2.6.1
  • Transformers: 4.39.3
  • PyTorch: 2.1.2
  • Datasets: 2.17.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Model size: 33.4M params (Safetensors, tensor type F32)