base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
library_name: setfit
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
widget:
  - text: How much should I invest in communication activities?
  - text: In addition, we will consider public reactions and reviews of these works.
  - text: Grundlagen der Fachdidaktik Pädagogik
  - text: >-
      Die Einzelthemen umfassen: * Hard- and Software-Architecture of Modern
      Game Systems * Time Management in Milliseconds * Asset Loading and
      Compression * Physically Based Realtime Rendering and Animations *
      Handling of Large Game Scenes * Audio Simulation and Mixing *
      Constraint-Based Physics Simulation * Artificial Intelligence for Games *
      Multiplayer-Networking * Procedural Content Creation * Integration of
      Scripting Languages * Optimization and parallelization of CPU and GPU Code
      Die Übungen enthalten Theorie- und Praxisanteile.
  - text: >-
      Wie entsteht überhaupt eine Ausstellung und in diesem Fall: eine, die
      weniger auf den Wert des Originals als die Kreativität ihrer Besucher
      setzt?
inference: false

SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a MultiOutputClassifier instance as the classification head, so a single text can receive several labels at once.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
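The first step above trains on pairs of texts rather than on single examples. The sketch below is a simplified illustration of how such pairs could be built for a multi-label dataset, assuming a pair counts as similar (target 1.0) when the two texts share at least one active label. It is not SetFit's actual pair sampler; `build_pairs`, the two-label scheme, and the second and third example texts are hypothetical.

```python
from itertools import combinations


def build_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, multi_hot_labels) items.

    Assumption: target is 1.0 when the two texts share at least one active
    label and 0.0 otherwise. A cosine-similarity loss would then pull the
    embeddings of 1.0 pairs together and push 0.0 pairs apart.
    """
    pairs = []
    for (text_a, labels_a), (text_b, labels_b) in combinations(examples, 2):
        shares_label = any(a and b for a, b in zip(labels_a, labels_b))
        pairs.append((text_a, text_b, 1.0 if shares_label else 0.0))
    return pairs


# Hypothetical two-label scheme: [education, game_development]
examples = [
    ("Grundlagen der Fachdidaktik Pädagogik", [1, 0]),
    ("Introduction to teaching methods", [1, 0]),
    ("Constraint-based physics simulation", [0, 1]),
]

pairs = build_pairs(examples)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

In the second step, the fine-tuned embeddings of the individual texts (not the pairs) serve as input features for the classification head.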

Model Details

Model Description

Model Sources

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Chernoffface/fs-setfit-multilable-model")
# Run inference
preds = model("Grundlagen der Fachdidaktik Pädagogik")
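For several texts at once, a small wrapper can turn the raw predictions into lists of label indices. This is a sketch under the assumption that the MultiOutputClassifier head returns one multi-hot row per text (for example `[0, 1, 1]`); the card does not list label names, so only indices are reported. `active_label_indices` and `predict_labels` are hypothetical helper names.

```python
def active_label_indices(prediction_rows):
    """Convert multi-hot rows such as [0, 1, 1] into index lists such as [1, 2]."""
    return [
        [index for index, flag in enumerate(row) if int(flag) == 1]
        for row in prediction_rows
    ]


def predict_labels(texts, model=None):
    """Predict active label indices for a list of texts.

    If no model is passed, the model from this card is downloaded; the import
    is done lazily so the helper above can be used without SetFit installed.
    """
    if model is None:
        from setfit import SetFitModel

        model = SetFitModel.from_pretrained("Chernoffface/fs-setfit-multilable-model")
    preds = model.predict(texts)
    # Tensors and arrays expose tolist(); plain lists are used as they are.
    rows = preds.tolist() if hasattr(preds, "tolist") else preds
    return active_label_indices(rows)
```

Calling `predict_labels(["Grundlagen der Fachdidaktik Pädagogik"])` would then return one list of indices for that text; which indices come back depends on the trained model.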

Training Details

Training Set Metrics

| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 1   | 12.9119 | 131 |
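Statistics of this kind can be reproduced with a few lines of standard-library Python. The sketch below assumes words are counted by splitting on whitespace, which the card does not state; `word_count_stats` is a hypothetical helper, and the two sample texts are taken from the widget examples. Note that a median of whole-number counts is always a whole or half number, so the 12.9119 reported above may in fact be a mean; the sketch therefore reports both.

```python
from statistics import mean, median


def word_count_stats(texts):
    """Return min, median, mean and max of whitespace-separated word counts."""
    counts = [len(text.split()) for text in texts]
    return {
        "min": min(counts),
        "median": median(counts),
        "mean": mean(counts),
        "max": max(counts),
    }


samples = [
    "Grundlagen der Fachdidaktik Pädagogik",
    "How much should I invest in communication activities?",
]

stats = word_count_stats(samples)
print(stats)
```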

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (2, 2)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 40
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
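The values above map onto keyword arguments of SetFit's `TrainingArguments`. The following is a sketch of how they might be passed, not the author's training script. `loss` and `distance_metric` are left out because they take objects rather than plain values; CosineSimilarityLoss and cosine distance appear to be the library defaults, but treat that as an assumption. `HYPERPARAMETERS` and `build_training_arguments` are hypothetical names.

```python
# Hyperparameters copied from the list above. Tuples hold the values for the
# embedding fine-tuning phase and the classifier training phase, in that order.
HYPERPARAMETERS = {
    "batch_size": (16, 16),
    "num_epochs": (2, 2),
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "num_iterations": 40,
    "body_learning_rate": (2e-05, 2e-05),
    "head_learning_rate": 2e-05,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "l2_weight": 0.01,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Create a TrainingArguments object from the values above.

    The import is lazy so the dictionary can be inspected without SetFit
    installed. Assumes a SetFit release that accepts all of these keywords
    (the card lists SetFit 1.1.0).
    """
    from setfit import TrainingArguments

    return TrainingArguments(**HYPERPARAMETERS)
```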

Training Results

Epoch Step Training Loss Validation Loss
0.0001 1 0.1571 -
0.0063 50 0.1986 -
0.0127 100 0.1774 -
0.0190 150 0.136 -
0.0254 200 0.1061 -
0.0317 250 0.0779 -
0.0380 300 0.0671 -
0.0444 350 0.0482 -
0.0507 400 0.0444 -
0.0571 450 0.0427 -
0.0634 500 0.0323 -
0.0698 550 0.0274 -
0.0761 600 0.0301 -
0.0824 650 0.0259 -
0.0888 700 0.0274 -
0.0951 750 0.0305 -
0.1015 800 0.0221 -
0.1078 850 0.0185 -
0.1141 900 0.0208 -
0.1205 950 0.0198 -
0.1268 1000 0.0107 -
0.1332 1050 0.0149 -
0.1395 1100 0.0162 -
0.1458 1150 0.0119 -
0.1522 1200 0.0162 -
0.1585 1250 0.0133 -
0.1649 1300 0.0177 -
0.1712 1350 0.0102 -
0.1776 1400 0.0224 -
0.1839 1450 0.0107 -
0.1902 1500 0.0182 -
0.1966 1550 0.0137 -
0.2029 1600 0.0158 -
0.2093 1650 0.0142 -
0.2156 1700 0.0117 -
0.2219 1750 0.0161 -
0.2283 1800 0.0128 -
0.2346 1850 0.0118 -
0.2410 1900 0.0125 -
0.2473 1950 0.0135 -
0.2536 2000 0.0123 -
0.2600 2050 0.0128 -
0.2663 2100 0.0119 -
0.2727 2150 0.0074 -
0.2790 2200 0.0116 -
0.2854 2250 0.0088 -
0.2917 2300 0.008 -
0.2980 2350 0.0137 -
0.3044 2400 0.0087 -
0.3107 2450 0.0107 -
0.3171 2500 0.0118 -
0.3234 2550 0.0096 -
0.3297 2600 0.0073 -
0.3361 2650 0.0125 -
0.3424 2700 0.0085 -
0.3488 2750 0.0081 -
0.3551 2800 0.0097 -
0.3614 2850 0.0104 -
0.3678 2900 0.0062 -
0.3741 2950 0.0124 -
0.3805 3000 0.0115 -
0.3868 3050 0.012 -
0.3932 3100 0.0147 -
0.3995 3150 0.0097 -
0.4058 3200 0.0107 -
0.4122 3250 0.0074 -
0.4185 3300 0.013 -
0.4249 3350 0.0115 -
0.4312 3400 0.008 -
0.4375 3450 0.0087 -
0.4439 3500 0.0099 -
0.4502 3550 0.0076 -
0.4566 3600 0.0118 -
0.4629 3650 0.013 -
0.4692 3700 0.0107 -
0.4756 3750 0.0123 -
0.4819 3800 0.0101 -
0.4883 3850 0.0095 -
0.4946 3900 0.01 -
0.5010 3950 0.0068 -
0.5073 4000 0.0064 -
0.5136 4050 0.0096 -
0.5200 4100 0.0063 -
0.5263 4150 0.0083 -
0.5327 4200 0.0067 -
0.5390 4250 0.0095 -
0.5453 4300 0.0097 -
0.5517 4350 0.0057 -
0.5580 4400 0.0101 -
0.5644 4450 0.0101 -
0.5707 4500 0.0043 -
0.5770 4550 0.0099 -
0.5834 4600 0.0091 -
0.5897 4650 0.0065 -
0.5961 4700 0.0071 -
0.6024 4750 0.0035 -
0.6088 4800 0.0088 -
0.6151 4850 0.0079 -
0.6214 4900 0.0094 -
0.6278 4950 0.0105 -
0.6341 5000 0.0091 -
0.6405 5050 0.0109 -
0.6468 5100 0.0081 -
0.6531 5150 0.0087 -
0.6595 5200 0.0091 -
0.6658 5250 0.0071 -
0.6722 5300 0.0072 -
0.6785 5350 0.0084 -
0.6848 5400 0.0099 -
0.6912 5450 0.004 -
0.6975 5500 0.0038 -
0.7039 5550 0.0072 -
0.7102 5600 0.0084 -
0.7166 5650 0.004 -
0.7229 5700 0.0077 -
0.7292 5750 0.0066 -
0.7356 5800 0.0043 -
0.7419 5850 0.0054 -
0.7483 5900 0.0107 -
0.7546 5950 0.0046 -
0.7609 6000 0.0075 -
0.7673 6050 0.0106 -
0.7736 6100 0.0063 -
0.7800 6150 0.007 -
0.7863 6200 0.0066 -
0.7926 6250 0.0067 -
0.7990 6300 0.0078 -
0.8053 6350 0.0093 -
0.8117 6400 0.0055 -
0.8180 6450 0.0074 -
0.8244 6500 0.0115 -
0.8307 6550 0.0058 -
0.8370 6600 0.005 -
0.8434 6650 0.007 -
0.8497 6700 0.0053 -
0.8561 6750 0.0086 -
0.8624 6800 0.0054 -
0.8687 6850 0.0055 -
0.8751 6900 0.006 -
0.8814 6950 0.0068 -
0.8878 7000 0.0103 -
0.8941 7050 0.0054 -
0.9004 7100 0.007 -
0.9068 7150 0.0047 -
0.9131 7200 0.0076 -
0.9195 7250 0.0077 -
0.9258 7300 0.0058 -
0.9321 7350 0.0056 -
0.9385 7400 0.0041 -
0.9448 7450 0.0062 -
0.9512 7500 0.0044 -
0.9575 7550 0.0042 -
0.9639 7600 0.0095 -
0.9702 7650 0.0045 -
0.9765 7700 0.0062 -
0.9829 7750 0.0036 -
0.9892 7800 0.0086 -
0.9956 7850 0.0071 -
1.0019 7900 0.0103 -
1.0082 7950 0.004 -
1.0146 8000 0.0059 -
1.0209 8050 0.0053 -
1.0273 8100 0.0079 -
1.0336 8150 0.0078 -
1.0399 8200 0.0077 -
1.0463 8250 0.0062 -
1.0526 8300 0.005 -
1.0590 8350 0.0071 -
1.0653 8400 0.0042 -
1.0717 8450 0.0054 -
1.0780 8500 0.0048 -
1.0843 8550 0.0045 -
1.0907 8600 0.0062 -
1.0970 8650 0.0094 -
1.1034 8700 0.0043 -
1.1097 8750 0.004 -
1.1160 8800 0.003 -
1.1224 8850 0.0026 -
1.1287 8900 0.0051 -
1.1351 8950 0.0046 -
1.1414 9000 0.0046 -
1.1477 9050 0.0075 -
1.1541 9100 0.0066 -
1.1604 9150 0.0078 -
1.1668 9200 0.0069 -
1.1731 9250 0.0087 -
1.1795 9300 0.0047 -
1.1858 9350 0.0037 -
1.1921 9400 0.007 -
1.1985 9450 0.0069 -
1.2048 9500 0.0061 -
1.2112 9550 0.0047 -
1.2175 9600 0.0065 -
1.2238 9650 0.0058 -
1.2302 9700 0.0061 -
1.2365 9750 0.0055 -
1.2429 9800 0.0064 -
1.2492 9850 0.0041 -
1.2555 9900 0.0086 -
1.2619 9950 0.0053 -
1.2682 10000 0.0047 -
1.2746 10050 0.0053 -
1.2809 10100 0.003 -
1.2873 10150 0.0046 -
1.2936 10200 0.0052 -
1.2999 10250 0.0056 -
1.3063 10300 0.0052 -
1.3126 10350 0.0079 -
1.3190 10400 0.006 -
1.3253 10450 0.0055 -
1.3316 10500 0.0066 -
1.3380 10550 0.0076 -
1.3443 10600 0.0037 -
1.3507 10650 0.0066 -
1.3570 10700 0.0059 -
1.3633 10750 0.0057 -
1.3697 10800 0.0038 -
1.3760 10850 0.0044 -
1.3824 10900 0.0059 -
1.3887 10950 0.0073 -
1.3951 11000 0.0055 -
1.4014 11050 0.0039 -
1.4077 11100 0.0054 -
1.4141 11150 0.0068 -
1.4204 11200 0.0067 -
1.4268 11250 0.0041 -
1.4331 11300 0.0076 -
1.4394 11350 0.0071 -
1.4458 11400 0.0044 -
1.4521 11450 0.0061 -
1.4585 11500 0.0039 -
1.4648 11550 0.006 -
1.4711 11600 0.0045 -
1.4775 11650 0.0044 -
1.4838 11700 0.0063 -
1.4902 11750 0.0061 -
1.4965 11800 0.0058 -
1.5029 11850 0.0039 -
1.5092 11900 0.0041 -
1.5155 11950 0.0052 -
1.5219 12000 0.0034 -
1.5282 12050 0.0078 -
1.5346 12100 0.0049 -
1.5409 12150 0.0064 -
1.5472 12200 0.0063 -
1.5536 12250 0.0068 -
1.5599 12300 0.008 -
1.5663 12350 0.0043 -
1.5726 12400 0.0057 -
1.5789 12450 0.0044 -
1.5853 12500 0.0048 -
1.5916 12550 0.0049 -
1.5980 12600 0.0052 -
1.6043 12650 0.0061 -
1.6107 12700 0.0066 -
1.6170 12750 0.0079 -
1.6233 12800 0.0047 -
1.6297 12850 0.005 -
1.6360 12900 0.0034 -
1.6424 12950 0.0051 -
1.6487 13000 0.006 -
1.6550 13050 0.0046 -
1.6614 13100 0.003 -
1.6677 13150 0.0055 -
1.6741 13200 0.0069 -
1.6804 13250 0.0033 -
1.6867 13300 0.0095 -
1.6931 13350 0.0043 -
1.6994 13400 0.0055 -
1.7058 13450 0.0081 -
1.7121 13500 0.0042 -
1.7185 13550 0.0081 -
1.7248 13600 0.0055 -
1.7311 13650 0.0043 -
1.7375 13700 0.0033 -
1.7438 13750 0.0044 -
1.7502 13800 0.0062 -
1.7565 13850 0.0032 -
1.7628 13900 0.0043 -
1.7692 13950 0.0079 -
1.7755 14000 0.0053 -
1.7819 14050 0.0044 -
1.7882 14100 0.0064 -
1.7945 14150 0.0051 -
1.8009 14200 0.0088 -
1.8072 14250 0.0048 -
1.8136 14300 0.0044 -
1.8199 14350 0.0071 -
1.8263 14400 0.0058 -
1.8326 14450 0.007 -
1.8389 14500 0.0028 -
1.8453 14550 0.0046 -
1.8516 14600 0.0061 -
1.8580 14650 0.0054 -
1.8643 14700 0.004 -
1.8706 14750 0.0034 -
1.8770 14800 0.0044 -
1.8833 14850 0.0033 -
1.8897 14900 0.007 -
1.8960 14950 0.0044 -
1.9023 15000 0.0045 -
1.9087 15050 0.0045 -
1.9150 15100 0.0093 -
1.9214 15150 0.0036 -
1.9277 15200 0.0055 -
1.9341 15250 0.0037 -
1.9404 15300 0.0043 -
1.9467 15350 0.0034 -
1.9531 15400 0.0068 -
1.9594 15450 0.0058 -
1.9658 15500 0.0069 -
1.9721 15550 0.0081 -
1.9784 15600 0.0061 -
1.9848 15650 0.0039 -
1.9911 15700 0.0065 -
1.9975 15750 0.0048 -
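The Epoch and Step columns together give a rough idea of the size of the contrastive training set. The sketch below assumes that the epoch value is simply the step divided by the number of steps per epoch, and that each step consumes one batch of 16 pairs (the batch size listed under Training Hyperparameters) without gradient accumulation. Both are assumptions; the resulting figures are estimates, not numbers reported by the card.

```python
# (step, epoch) rows copied from the table above.
ROWS = [(7900, 1.0019), (15750, 1.9975)]
BATCH_SIZE = 16  # from the hyperparameter list; assumed to be pairs per step


def steps_per_epoch(step, epoch):
    """Estimate steps per epoch, assuming epoch = step / steps_per_epoch."""
    return step / epoch


estimates = [steps_per_epoch(step, epoch) for step, epoch in ROWS]
mean_estimate = sum(estimates) / len(estimates)
pairs_per_epoch = mean_estimate * BATCH_SIZE

print(f"steps per epoch: about {mean_estimate:.0f}")
print(f"pairs per epoch: about {pairs_per_epoch:.0f}")
```

Because the epoch values in the table are rounded to four decimals, rows from late in training give a more stable estimate than the first few rows.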

Framework Versions

  • Python: 3.12.3
  • SetFit: 1.1.0
  • Sentence Transformers: 3.2.0
  • Transformers: 4.45.2
  • PyTorch: 2.5.0+cu121
  • Datasets: 3.0.1
  • Tokenizers: 0.20.1
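To compare a local environment against these versions, something like the following could be used. It is a minimal sketch that assumes the libraries are installed under their usual PyPI distribution names (`torch` for PyTorch); `EXPECTED`, `installed_version`, and `find_mismatches` are hypothetical names. A mismatch does not necessarily mean the model will fail to load.

```python
from importlib import metadata

# Versions copied from the list above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "setfit": "1.1.0",
    "sentence-transformers": "3.2.0",
    "transformers": "4.45.2",
    "torch": "2.5.0+cu121",
    "datasets": "3.0.1",
    "tokenizers": "0.20.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) tuple."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches().items():
        print(f"{package}: card lists {wanted}, found {found}")
```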

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}