diff --git "a/README.md" "b/README.md" new file mode 100644--- /dev/null +++ "b/README.md" @@ -0,0 +1,331 @@ +--- +tags: +- setfit +- sentence-transformers +- text-classification +- generated_from_setfit_trainer +widget: +- text: ' (Karte + Zahlen) [LinearLayout|WebView]' +- text: ' (Hallo, ) [FrameLayout|WebView] | (Zurück zur vorherigen Seite) [ImageButton|WebView] + | Du hast () [TextView|WebView] | 9 () [TextView|WebView] | °Punkte. (Punkte.) + [TextView|WebView] | (Weitere Informationen öffnen) [ImageView|WebView] | Profil + vervollständigen () [TextView|WebView] | (Profilvervollständigung ausblenden + – wird nach erneuter Anmeldung wieder eingeblendet.) [ImageView|WebView] | 29 + % () [TextView|WebView] | erledigt () [TextView|WebView] | Klasse, Du bist auf + einem guten Weg. Weiter so! () [TextView|WebView] | Meine Funktionen () [TextView|WebView] + | Meine persönlichen Daten () [TextView|WebView] | 1 () [TextView|WebView] | Meine + Kassenbons () [TextView|WebView] | Mein PAYBACK () [TextView|WebView] | (Rabattkarte + teilen oder drucken) [ViewGroup|WebView] | Netto plus Karte zum Ausdrucken () + [TextView|WebView] | Meine Bezahloptionen () [TextView|WebView] | Meine Profilvervollständigung + () [TextView|WebView] | 29 % eingerichtet () [TextView|WebView] | Hilfe & Kontakt + () [TextView|WebView] | Meine Einstellungen () [TextView|WebView] | Meine Vorteilswelt + () [TextView|WebView] | Abmelden () [TextView|WebView] | °Punkte sammeln (Punkte + sammeln) [TextView|WebView] | mit attraktiven Coupons () [TextView|WebView] | + Weiter () [TextView|WebView]' +- text: ' (Hallo, ) [FrameLayout|WebView] | (Zurück zur vorherigen Seite) [ImageButton|WebView] + | Du hast () [TextView|WebView] | 9 () [TextView|WebView] | °Punkte. (Punkte.) + [TextView|WebView] | (Weitere Informationen öffnen) [ImageView|WebView] | Profil + vervollständigen () [TextView|WebView] | (Profilvervollständigung ausblenden + – wird nach erneuter Anmeldung wieder eingeblendet.) 
[ImageView|WebView] | 29 + % () [TextView|WebView] | erledigt () [TextView|WebView] | Klasse, Du bist auf + einem guten Weg. Weiter so! () [TextView|WebView] | Meine Funktionen () [TextView|WebView] + | Meine persönlichen Daten () [TextView|WebView] | 1 () [TextView|WebView] | Meine + Kassenbons () [TextView|WebView] | Mein PAYBACK () [TextView|WebView] | (Rabattkarte + teilen oder drucken) [ViewGroup|WebView] | Netto plus Karte zum Ausdrucken () + [TextView|WebView] | Meine Bezahloptionen () [TextView|WebView] | Meine Profilvervollständigung + () [TextView|WebView] | 29 % eingerichtet () [TextView|WebView] | Hilfe & Kontakt + () [TextView|WebView] | Meine Einstellungen () [TextView|WebView] | Meine Vorteilswelt + () [TextView|WebView] | Abmelden () [TextView|WebView] | °Punkte sammeln (Punkte + sammeln) [TextView|WebView] | mit attraktiven Coupons () [TextView|WebView] | + Weiter () [TextView|WebView]' +- text: ' (Hallo, ) [FrameLayout|RecyclerView] | (Zurück zur vorherigen Seite) [ImageButton|RecyclerView] + | Du hast () [TextView|RecyclerView] | 9 () [TextView|RecyclerView] | °Punkte. + (Punkte.) [TextView|RecyclerView] | (Weitere Informationen öffnen) [ImageView|RecyclerView] + | Profil vervollständigen () [TextView|RecyclerView] | (Profilvervollständigung + ausblenden – wird nach erneuter Anmeldung wieder eingeblendet.) [ImageView|RecyclerView] + | 29 % () [TextView|RecyclerView] | erledigt () [TextView|RecyclerView] | Klasse, + Du bist auf einem guten Weg. Weiter so! 
() [TextView|RecyclerView] | Meine Funktionen + () [TextView|RecyclerView] | Meine persönlichen Daten () [TextView|RecyclerView] + | 1 () [TextView|RecyclerView] | Meine Kassenbons () [TextView|RecyclerView] | + Mein PAYBACK () [TextView|RecyclerView] | (Rabattkarte teilen oder drucken) [ViewGroup|RecyclerView] + | Netto plus Karte zum Ausdrucken () [TextView|RecyclerView] | Meine Bezahloptionen + () [TextView|RecyclerView] | Meine Profilvervollständigung () [TextView|RecyclerView] + | 29 % eingerichtet () [TextView|RecyclerView] | Hilfe & Kontakt () [TextView|RecyclerView] + | Meine Einstellungen () [TextView|RecyclerView] | Meine Vorteilswelt () [TextView|RecyclerView] + | Abmelden () [TextView|RecyclerView] | °Punkte sammeln (Punkte sammeln) [TextView|RecyclerView] + | mit attraktiven Coupons () [TextView|RecyclerView] | Weiter () [TextView|RecyclerView]' +- text: Rabatt auf Alles ab 50€ Einkaufswert () [TextView|View] | Zum Online-Shop + () [TextView|View] | Versandkostenfrei () [TextView|View] | Rabatt () [TextView|View] + | ab 50€ MBW auf alles () [TextView|View] | Zum Online-Shop () [TextView|View] + | Online () [TextView|View] | 15€ () [TextView|View] | Rabatt () [TextView|View] + | Rabatt auf Alles ab 150€ Einkaufswert () [TextView|View] | Zum Online-Shop () + [TextView|View] | Online () [TextView|View] | 15% () [TextView|View] | Rabatt + () [TextView|View] | auf die Kategorie Drogerie ohne MBW () [TextView|View] | + Online () [TextView|View] | (Zurück zur vorherigen Seite) [View|View] | (Zurück + zur vorherigen Seite) [View|View] | Zurück () [TextView|View] +metrics: +- accuracy +pipeline_tag: text-classification +library_name: setfit +inference: true +datasets: +- tmp-org/dm-0-0 +base_model: Alibaba-NLP/gte-multilingual-base +--- + +# SetFit with Alibaba-NLP/gte-multilingual-base + +This is a [SetFit](https://github.com/huggingface/setfit) model trained on the [tmp-org/dm-0-0](https://huggingface.co/datasets/tmp-org/dm-0-0) dataset that can be used for 
Text Classification. This SetFit model uses [Alibaba-NLP/gte-multilingual-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-base) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification. + +The model has been trained using an efficient few-shot learning technique that involves: + +1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning. +2. Training a classification head with features from the fine-tuned Sentence Transformer. + +## Model Details + +### Model Description +- **Model Type:** SetFit +- **Sentence Transformer body:** [Alibaba-NLP/gte-multilingual-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-base) +- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance +- **Maximum Sequence Length:** 8192 tokens +- **Number of Classes:** 27 classes +- **Training Dataset:** [tmp-org/dm-0-0](https://huggingface.co/datasets/tmp-org/dm-0-0) + + + +### Model Sources + +- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit) +- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055) +- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit) + +### Model Labels +| Label | Examples | 
+|:---------------------------------|:---------| +| Startseite_Startseite | | +| Coupons_Coupons | | +| Angebote_Angebote | | +| Online-Shop_Online-Shop | | +| Other_Loading | | +| Karte + Zahlen_Loading | | +| Angebote_Loading | | +| Karte + Zahlen_Coupons | | +| Other_Neuigkeiten | | +| Other_Gewinnspiel | | +| Other_Meine Funktionen | | +| Other_Prospekte | | +| Karte + Zahlen_Nur Karte | | +| Other_Angebote details | | +| Startseite_Loading | | +| Other_Mein PAYBACK | | +| Other_Einkaufsliste | | +| Other_Adventskalender | | +| Other_Meine digitalen Kassenbons | | +| Other_Information | | +| Other_Coupon details | | +| Karte + Zahlen_Karte + Zahlen | | +| Online-Shop_Loading | | +| Other_Rezepte | | +| Other_Unknown | | +| Other_Code einlösen | | +| Coupons_Loading | | + +## Uses + +### Direct Use for Inference + +First install the SetFit library: + +```bash +pip install setfit +``` + +Then you can load this model and run inference. 
+ +```python +from setfit import SetFitModel + +# Download from the 🤗 Hub +model = SetFitModel.from_pretrained("tmp-org/dm_v1") +# Run inference +preds = model(" (Karte + Zahlen) [LinearLayout|WebView]") +``` + + + + + + + + + +## Training Details + +### Training Set Metrics +| Training set | Min | Median | Max | +|:-------------|:----|:---------|:-----| +| Word count | 3 | 187.2463 | 8088 | + +| Label | Training Sample Count | +|:---------------------------------|:----------------------| +| Angebote_Angebote | 32 | +| Angebote_Loading | 8 | +| Coupons_Coupons | 32 | +| Coupons_Loading | 3 | +| Karte + Zahlen_Coupons | 32 | +| Karte + Zahlen_Karte + Zahlen | 11 | +| Karte + Zahlen_Loading | 8 | +| Karte + Zahlen_Nur Karte | 22 | +| Online-Shop_Loading | 7 | +| Online-Shop_Online-Shop | 32 | +| Other_Adventskalender | 4 | +| Other_Angebote details | 29 | +| Other_Code einlösen | 1 | +| Other_Coupon details | 5 | +| Other_Einkaufsliste | 32 | +| Other_Gewinnspiel | 2 | +| Other_Information | 5 | +| Other_Loading | 10 | +| Other_Mein PAYBACK | 12 | +| Other_Meine Funktionen | 15 | +| Other_Meine digitalen Kassenbons | 6 | +| Other_Neuigkeiten | 18 | +| Other_Prospekte | 16 | +| Other_Rezepte | 31 | +| Other_Unknown | 1 | +| Startseite_Loading | 4 | +| Startseite_Startseite | 32 | + +### Training Hyperparameters +- batch_size: (4, 4) +- num_epochs: (1, 1) +- max_steps: -1 +- sampling_strategy: undersampling +- body_learning_rate: (2e-05, 1e-05) +- head_learning_rate: 0.01 +- loss: CosineSimilarityLoss +- distance_metric: cosine_distance +- margin: 0.25 +- end_to_end: False +- use_amp: False +- warmup_proportion: 0.1 +- l2_weight: 0.01 +- seed: 4242 +- eval_max_steps: -1 +- load_best_model_at_end: False + +### Training Results +| Epoch | Step | Training Loss | Validation Loss | +|:------:|:----:|:-------------:|:---------------:| +| 0.0004 | 1 | 0.0378 | - | +| 0.0194 | 50 | 0.2267 | - | +| 0.0388 | 100 | 0.1703 | - | +| 0.0581 | 150 | 0.1404 | - | +| 0.0775 | 200 | 
0.1242 | - | +| 0.0969 | 250 | 0.1205 | - | +| 0.1163 | 300 | 0.0991 | - | +| 0.1357 | 350 | 0.0971 | - | +| 0.1550 | 400 | 0.101 | - | +| 0.1744 | 450 | 0.0906 | - | +| 0.1938 | 500 | 0.0773 | - | +| 0.2132 | 550 | 0.08 | - | +| 0.2326 | 600 | 0.0352 | - | +| 0.2519 | 650 | 0.0546 | - | +| 0.2713 | 700 | 0.0746 | - | +| 0.2907 | 750 | 0.0858 | - | +| 0.3101 | 800 | 0.0558 | - | +| 0.3295 | 850 | 0.049 | - | +| 0.3488 | 900 | 0.0515 | - | +| 0.3682 | 950 | 0.0526 | - | +| 0.3876 | 1000 | 0.0563 | - | +| 0.4070 | 1050 | 0.0566 | - | +| 0.4264 | 1100 | 0.0326 | - | +| 0.4457 | 1150 | 0.0432 | - | +| 0.4651 | 1200 | 0.0675 | - | +| 0.4845 | 1250 | 0.0548 | - | +| 0.5039 | 1300 | 0.0304 | - | +| 0.5233 | 1350 | 0.043 | - | +| 0.5426 | 1400 | 0.0412 | - | +| 0.5620 | 1450 | 0.0401 | - | +| 0.5814 | 1500 | 0.0492 | - | +| 0.6008 | 1550 | 0.0355 | - | +| 0.6202 | 1600 | 0.0349 | - | +| 0.6395 | 1650 | 0.0372 | - | +| 0.6589 | 1700 | 0.035 | - | +| 0.6783 | 1750 | 0.0336 | - | +| 0.6977 | 1800 | 0.0214 | - | +| 0.7171 | 1850 | 0.0272 | - | +| 0.7364 | 1900 | 0.0291 | - | +| 0.7558 | 1950 | 0.0266 | - | +| 0.7752 | 2000 | 0.0464 | - | +| 0.7946 | 2050 | 0.0265 | - | +| 0.8140 | 2100 | 0.0187 | - | +| 0.8333 | 2150 | 0.027 | - | +| 0.8527 | 2200 | 0.0376 | - | +| 0.8721 | 2250 | 0.0276 | - | +| 0.8915 | 2300 | 0.0306 | - | +| 0.9109 | 2350 | 0.0201 | - | +| 0.9302 | 2400 | 0.0259 | - | +| 0.9496 | 2450 | 0.035 | - | +| 0.9690 | 2500 | 0.0351 | - | +| 0.9884 | 2550 | 0.0284 | - | + +### Framework Versions +- Python: 3.12.6 +- SetFit: 1.1.2 +- Sentence Transformers: 5.2.2 +- Transformers: 4.57.1 +- PyTorch: 2.10.0+cu128 +- Datasets: 3.6.0 +- Tokenizers: 0.22.2 + +## Citation + +### BibTeX +```bibtex +@article{https://doi.org/10.48550/arxiv.2209.11055, + doi = {10.48550/ARXIV.2209.11055}, + url = {https://arxiv.org/abs/2209.11055}, + author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren}, + 
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences}, + title = {Efficient Few-Shot Learning Without Prompts}, + publisher = {arXiv}, + year = {2022}, + copyright = {Creative Commons Attribution 4.0 International} +} +``` + + + + + + \ No newline at end of file