metadata
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
metrics:
  - accuracy
widget:
  - text: >-
      Implementing the reform required strong support from all ministries
      involved. A major effort was required to present the conceptual change to
      car importers, politicians and the public. A great deal was also invested
      in public relations to describe the benefits of the tax, which by many was
      perceived as yet another attempt to increase tax revenues. A number of the
      most popular car models’ prices were about to increase – mostly large
      family, luxury and sport cars – but for many models, the retail price was
      actually reduced.
  - text: >-
      Workers in the formal sector. Formal sector workers also face economic
      risks. A number of them experience income instability due to
      contractualization, retrenchment, and firm closures. In 2014, contractual
      workers accounted for 22 percent of the total 4.5 million workers employed
      in establishments with 20 or more employees.
  - text: >-
      Building additional dams and power stations to further develop energy
      generation potential from the same river flow as well as develop new dam
      sites on parallel rivers in order to maintain the baseline hydropower
      electricity generation capacity to levels attainable under a ‘no-climate
      change’ scenario. Developing and implementing climate change compatible
      building/construction codes for buildings, roads, airports, airfields, dry
      ports, railways, bridges, dams and irrigation canals that are safe for
      human life and minimize economic damage that is likely to result from
      increasing extremes in flooding.
  - text: >-
      Another factor that increases farmer vulnerability is the remoteness of
      farm villages and lack of adequate road infrastructure. Across the three
      regions, roads are in a poor state and unevenly distributed, with many
      villages lacking roads that connect them to other villages. Even the main
      roads are often accessible only during the dry season. The livelihood
      implications of this isolation are significant, as farmers have
      difficulties getting their products to markets as well as obtaining
      agricultural inputs; in addition, farmers generally have to pay higher
      prices for agricultural inputs in remote areas, reducing their profit
      margins
  - text: "This project aims to construct a desalination plant in the capital city in order to respond directly to drinking water supply needs. This new plant, which will have a capacity of 22,500 m3\_daily, easily expandable to 45,000 m3, will be fuelled by renewable energy, which is expected to be provided by a wind farm planned for the second phase of the project. Funding:\_European Union. Rural Community Development and Water Mobilization Project (PRODERMO)."
pipeline_tag: text-classification
inference: false
base_model: sentence-transformers/all-mpnet-base-v2

SetFit with sentence-transformers/all-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/all-mpnet-base-v2 as the Sentence Transformer embedding model and a SetFitHead instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
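
Step 1 above can be illustrated with a small, dependency-free sketch. It is not SetFit's implementation: the texts, labels, and toy embedding vectors are made up, the pairing assumes single-label data, and the undersampling rule is a simplified reading of the `sampling_strategy` setting. It shows how labeled examples become sentence pairs with a target similarity, and how a cosine-similarity loss scores one pair.

```python
import math
from itertools import combinations


def make_pairs(texts, labels):
    """Build (text_a, text_b, target) triples: 1.0 if labels match, else 0.0."""
    positives, negatives = [], []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        if label_a == label_b:
            positives.append((text_a, text_b, 1.0))
        else:
            negatives.append((text_a, text_b, 0.0))
    # Simplified undersampling: trim the larger group to the size of the smaller one.
    n = min(len(positives), len(negatives))
    return positives[:n] + negatives[:n]


def cosine_similarity_loss(u, v, target):
    """Squared error between the cosine similarity of u and v and the target."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return (dot / norm - target) ** 2


# Hypothetical toy data, not taken from the model's training set.
texts = [
    "roads flood in the wet season",
    "villages lack road access",
    "contract workers face layoffs",
    "firms close and cut jobs",
]
labels = ["infrastructure", "infrastructure", "employment", "employment"]

pairs = make_pairs(texts, labels)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} / {text_b!r}")

# Toy 2-d "embeddings": identical vectors give zero loss for a positive pair.
print(cosine_similarity_loss([1.0, 0.0], [1.0, 0.0], target=1.0))
```

During fine-tuning, a loss of this kind pulls same-label sentences together in embedding space and pushes different-label sentences apart; the classification head in step 2 is then trained on the resulting embeddings.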

Model Details

Model Description

Model Sources

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("leavoigt/vulnerability_multilabel_updated")
# Run inference
preds = model("Workers in the formal sector. Formal sector workers also face economic risks. A number of them experience income instability due to contractualization, retrenchment, and firm closures. In 2014, contractual workers accounted for 22 percent of the total 4.5 million workers employed in establishments with 20 or more employees.")
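
The repository name suggests a multi-label classifier, although this card does not list the labels. Assuming that is the case, and that a prediction has been converted to a plain list of 0/1 flags (for example with `.tolist()` on a tensor), the sketch below maps the flags to names. The label names and the example prediction are hypothetical placeholders; the real labels would need to come from the model repository.

```python
def decode_multi_hot(flags, label_names):
    """Return the names whose flag is set in a multi-hot prediction."""
    if len(flags) != len(label_names):
        raise ValueError("prediction length does not match the number of labels")
    return [name for name, flag in zip(label_names, flags) if int(flag) == 1]


# Hypothetical label names and prediction, for illustration only.
label_names = ["label_a", "label_b", "label_c"]
example_prediction = [0, 1, 1]

print(decode_multi_hot(example_prediction, label_names))
```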

Training Details

Training Set Metrics

| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 21  | 72.6472 | 238 |

Training Hyperparameters

  • batch_size: (16, 2)
  • num_epochs: (1, 0)
  • max_steps: -1
  • sampling_strategy: undersampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
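
In SetFit's `TrainingArguments`, a tuple value for `batch_size`, `num_epochs`, or `body_learning_rate` gives separate settings for the embedding phase and the classifier phase, while a single value applies to both. The sketch below only unpacks the values listed above to make that reading explicit; `per_phase` is a hypothetical helper, not part of SetFit.

```python
def per_phase(value):
    """Split a setting into (embedding_phase, classifier_phase) values."""
    if isinstance(value, tuple):
        embedding, classifier = value
        return embedding, classifier
    return value, value


# Values copied from the hyperparameter list above.
settings = {
    "batch_size": (16, 2),
    "num_epochs": (1, 0),
    "body_learning_rate": (2e-05, 1e-05),
}

for name, value in settings.items():
    embedding, classifier = per_phase(value)
    print(f"{name}: embedding phase = {embedding}, classifier phase = {classifier}")
```

Under that reading, the embedding model was fine-tuned for one epoch at batch size 16, and the classifier phase was configured with batch size 2 and zero epochs.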

Training Results

| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0006 | 1    | 0.1906        | -               |
| 0.0316 | 50   | 0.1275        | 0.1394          |
| 0.0631 | 100  | 0.0851        | 0.1247          |
| 0.0947 | 150  | 0.0959        | 0.1269          |
| 0.1263 | 200  | 0.1109        | 0.1179          |
| 0.1578 | 250  | 0.0923        | 0.1354          |
| 0.1894 | 300  | 0.063         | 0.1292          |
| 0.2210 | 350  | 0.0555        | 0.1326          |
| 0.2525 | 400  | 0.0362        | 0.1127          |
| 0.2841 | 450  | 0.0582        | 0.132           |
| 0.3157 | 500  | 0.0952        | 0.1339          |
| 0.3472 | 550  | 0.0793        | 0.1171          |
| 0.3788 | 600  | 0.059         | 0.1187          |
| 0.4104 | 650  | 0.0373        | 0.1131          |
| 0.4419 | 700  | 0.0593        | 0.1144          |
| 0.4735 | 750  | 0.0405        | 0.1174          |
| 0.5051 | 800  | 0.0284        | 0.1196          |
| 0.5366 | 850  | 0.0329        | 0.1116          |
| 0.5682 | 900  | 0.0895        | 0.1193          |
| 0.5997 | 950  | 0.0576        | 0.1159          |
| 0.6313 | 1000 | 0.0385        | 0.1203          |
| 0.6629 | 1050 | 0.0842        | 0.1195          |
| 0.6944 | 1100 | 0.0274        | 0.113           |
| 0.7260 | 1150 | 0.0226        | 0.1137          |
| 0.7576 | 1200 | 0.0276        | 0.1204          |
| 0.7891 | 1250 | 0.0355        | 0.1163          |
| 0.8207 | 1300 | 0.077         | 0.1161          |
| 0.8523 | 1350 | 0.0735        | 0.1135          |
| 0.8838 | 1400 | 0.0357        | 0.1175          |
| 0.9154 | 1450 | 0.0313        | 0.1207          |
| 0.9470 | 1500 | 0.0241        | 0.1159          |
| 0.9785 | 1550 | 0.0339        | 0.1161          |
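
Because `load_best_model_at_end` is False, the trainer did not reload the checkpoint with the lowest validation loss, so the published weights presumably come from the end of training. The sketch below uses the rows of the table above (step 1 is left out because it has no validation loss) to locate the best evaluation step and compare it with the last one.

```python
# (step, training_loss, validation_loss) rows copied from the Training Results table.
rows = [
    (50, 0.1275, 0.1394), (100, 0.0851, 0.1247), (150, 0.0959, 0.1269),
    (200, 0.1109, 0.1179), (250, 0.0923, 0.1354), (300, 0.063, 0.1292),
    (350, 0.0555, 0.1326), (400, 0.0362, 0.1127), (450, 0.0582, 0.132),
    (500, 0.0952, 0.1339), (550, 0.0793, 0.1171), (600, 0.059, 0.1187),
    (650, 0.0373, 0.1131), (700, 0.0593, 0.1144), (750, 0.0405, 0.1174),
    (800, 0.0284, 0.1196), (850, 0.0329, 0.1116), (900, 0.0895, 0.1193),
    (950, 0.0576, 0.1159), (1000, 0.0385, 0.1203), (1050, 0.0842, 0.1195),
    (1100, 0.0274, 0.113), (1150, 0.0226, 0.1137), (1200, 0.0276, 0.1204),
    (1250, 0.0355, 0.1163), (1300, 0.077, 0.1161), (1350, 0.0735, 0.1135),
    (1400, 0.0357, 0.1175), (1450, 0.0313, 0.1207), (1500, 0.0241, 0.1159),
    (1550, 0.0339, 0.1161),
]

# Step with the lowest validation loss, and the last evaluated step.
best_step, _, best_val = min(rows, key=lambda row: row[2])
final_step, _, final_val = rows[-1]

print(f"best:  step {best_step}, validation loss {best_val}")
print(f"final: step {final_step}, validation loss {final_val}")
```

On these numbers the gap is small: 0.1116 at step 850 versus 0.1161 at step 1550.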

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.3.1
  • Transformers: 4.38.1
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.3.0
  • Tokenizers: 0.15.2
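
To check a local environment against the versions above, the following sketch uses the standard-library `importlib.metadata`. The mapping from the names in the list to PyPI distribution names (for example PyTorch to `torch`) is an assumption of this sketch, and the installed PyTorch version string may or may not carry the `+cu121` build suffix depending on how it was installed.

```python
from importlib import metadata

# PyPI distribution names mapped to the versions listed above.
EXPECTED = {
    "setfit": "1.0.3",
    "sentence-transformers": "2.3.1",
    "transformers": "4.38.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.3.0",
    "tokenizers": "0.15.2",
}


def find_mismatches(expected):
    """Return {package: (expected, installed)} for packages that differ or are missing."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in find_mismatches(EXPECTED).items():
    print(f"{package}: card lists {wanted}, installed {installed}")
```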

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}