mpnet-multilabel-sector-classifier

This model is a fine-tuned version of sentence-transformers/all-mpnet-base-v2; the training dataset is not named in this card. It achieves the following results on the evaluation set:

  • Loss: 0.2273
  • Precision Micro: 0.8075
  • Precision Weighted: 0.8110
  • Precision Samples: 0.8365
  • Recall Micro: 0.8897
  • Recall Weighted: 0.8897
  • Recall Samples: 0.8922
  • F1-score: 0.8464
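
The micro, weighted, and samples variants reported above differ only in how counts are pooled before dividing: micro pools every label decision, weighted averages per-label scores by label support, and samples averages per-example scores. The following is a minimal pure-Python sketch on invented toy data (not the model's evaluation set); the function names are illustrative and this is not the evaluation code used for the card. It also shows why micro and weighted recall always coincide, which matches the identical Recall Micro and Recall Weighted values.

```python
# Toy multi-label data: rows are examples, columns are labels (1 = label present).
# These values are invented for illustration; they are not from the evaluation set.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [1, 0, 1]]


def ratio(num, den):
    """Return num / den, treating an empty denominator as 0.0."""
    return num / den if den else 0.0


def label_counts(y_true, y_pred):
    """Per-label (true positives, predicted positives, actual positives)."""
    counts = []
    for j in range(len(y_true[0])):
        tp = sum(t[j] * p[j] for t, p in zip(y_true, y_pred))
        counts.append((tp, sum(p[j] for p in y_pred), sum(t[j] for t in y_true)))
    return counts


def micro(y_true, y_pred):
    """Pool counts over all labels, then compute precision and recall once."""
    counts = label_counts(y_true, y_pred)
    tp = sum(c[0] for c in counts)
    return ratio(tp, sum(c[1] for c in counts)), ratio(tp, sum(c[2] for c in counts))


def weighted(y_true, y_pred):
    """Average per-label precision and recall, weighted by label support."""
    counts = label_counts(y_true, y_pred)
    total = sum(ap for _, _, ap in counts)
    precision = sum(ap * ratio(tp, pp) for tp, pp, ap in counts) / total
    recall = sum(ap * ratio(tp, ap) for tp, _, ap in counts) / total
    return precision, recall


def samples(y_true, y_pred):
    """Compute precision and recall per example, then average over examples."""
    precisions, recalls = [], []
    for t, p in zip(y_true, y_pred):
        tp = sum(a * b for a, b in zip(t, p))
        precisions.append(ratio(tp, sum(p)))
        recalls.append(ratio(tp, sum(t)))
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)


print("micro:   ", micro(y_true, y_pred))
print("weighted:", weighted(y_true, y_pred))
print("samples: ", samples(y_true, y_pred))
```

In the weighted recall, each label's support cancels against the denominator of its own recall, so the sum reduces to total true positives over total actual positives, which is the micro recall. Precision has no such cancellation, so the two precision variants can differ.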

Model description

This model is trained to perform multi-label sector classification: a single input text can be assigned more than one sector label.
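
A minimal usage sketch follows. It rests on several assumptions: that the checkpoint loads through the Transformers `text-classification` pipeline, that a sigmoid is the appropriate output activation for multi-label scores, and that 0.5 is a sensible decision threshold. None of these is stated in this card. The helper names `select_labels` and `classify` are hypothetical, and the label names returned depend on the model's own configuration.

```python
MODEL_ID = "ppsingh/mpnet-multilabel-sector-classifier"  # repo id as named on the model page


def select_labels(scores, threshold=0.5):
    """Keep every label whose score reaches the threshold (multi-label decision)."""
    return sorted(s["label"] for s in scores if s["score"] >= threshold)


def classify(texts, threshold=0.5):
    """Return the predicted labels for each text.

    Assumption: the checkpoint works with the text-classification pipeline.
    """
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification",
        model=MODEL_ID,
        top_k=None,                   # return a score for every label
        function_to_apply="sigmoid",  # independent per-label probabilities
    )
    # A list input yields one list of {"label", "score"} dicts per text.
    return [select_labels(scores, threshold) for scores in classifier(list(texts))]
```

Calling `classify(["some policy text"])` downloads the model on first use. The threshold trades precision against recall, so it is worth tuning on held-out data rather than relying on 0.5.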

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6.9e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 8
  • weight_decay: 0.001
  • gradient_accumulation_steps: 1
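
As a rough sketch, the settings above can be written as keyword arguments for `transformers.TrainingArguments`, and the linear schedule with 200 warmup steps can be worked out by hand. Two assumptions are made here: training ran on a single device, so the batch sizes map to the `per_device_*` parameters, and the total step count is 7176, taken from the last row of the training results table. The `linear_lr` helper is illustrative and not part of any library.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments parameter names.
training_kwargs = {
    "learning_rate": 6.9e-5,
    "per_device_train_batch_size": 16,  # assumes a single training device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 200,
    "num_train_epochs": 8,
    "weight_decay": 0.001,
    "gradient_accumulation_steps": 1,
}

TOTAL_STEPS = 7176  # last step in the training results table (8 epochs x 897 steps)


def linear_lr(step, base_lr=6.9e-5, warmup=200, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at the last step."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for step in (0, 100, 200, 3588, 7176):
    print(step, linear_lr(step))
```

With Transformers installed, the dictionary can be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder path. The learning rate peaks at 6.9e-05 at step 200 and reaches zero at the final step.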

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision Micro | Precision Weighted | Precision Samples | Recall Micro | Recall Weighted | Recall Samples | F1-score |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.4478 | 1.0 | 897 | 0.2277 | 0.6731 | 0.7183 | 0.7460 | 0.8822 | 0.8822 | 0.8989 | 0.7871 |
| 0.2241 | 2.0 | 1794 | 0.1862 | 0.7088 | 0.7485 | 0.7754 | 0.8933 | 0.8933 | 0.9110 | 0.8108 |
| 0.1647 | 3.0 | 2691 | 0.2025 | 0.6785 | 0.7023 | 0.7634 | 0.9124 | 0.9124 | 0.9252 | 0.8077 |
| 0.1232 | 4.0 | 3588 | 0.1839 | 0.7274 | 0.7322 | 0.7976 | 0.9029 | 0.9029 | 0.9134 | 0.8286 |
| 0.0899 | 5.0 | 4485 | 0.1889 | 0.7919 | 0.8007 | 0.8350 | 0.8909 | 0.8909 | 0.9060 | 0.8483 |
| 0.0653 | 6.0 | 5382 | 0.2039 | 0.7478 | 0.7544 | 0.8098 | 0.8973 | 0.8973 | 0.9114 | 0.8346 |
| 0.0462 | 7.0 | 6279 | 0.2149 | 0.7447 | 0.7500 | 0.8060 | 0.8989 | 0.8989 | 0.9107 | 0.8323 |
| 0.0336 | 8.0 | 7176 | 0.2181 | 0.7733 | 0.7780 | 0.8221 | 0.8909 | 0.8909 | 0.9031 | 0.8400 |
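
Training loss falls every epoch, while validation loss and F1-score stop improving partway through, so the choice of checkpoint depends on the selection criterion. The sketch below copies four columns from the table and picks the best epoch under two criteria. The card does not say which criterion, if any, was used to choose the published checkpoint, so this is only an illustration of how to read the table.

```python
# (epoch, training loss, validation loss, F1-score), copied from the table above.
history = [
    (1, 0.4478, 0.2277, 0.7871),
    (2, 0.2241, 0.1862, 0.8108),
    (3, 0.1647, 0.2025, 0.8077),
    (4, 0.1232, 0.1839, 0.8286),
    (5, 0.0899, 0.1889, 0.8483),
    (6, 0.0653, 0.2039, 0.8346),
    (7, 0.0462, 0.2149, 0.8323),
    (8, 0.0336, 0.2181, 0.8400),
]

# Lowest validation loss and highest F1-score can point at different epochs.
best_by_loss = min(history, key=lambda row: row[2])
best_by_f1 = max(history, key=lambda row: row[3])

print("lowest validation loss: epoch", best_by_loss[0], "loss", best_by_loss[2])
print("highest F1-score:       epoch", best_by_f1[0], "F1", best_by_f1[3])
```

Here validation loss is lowest at epoch 4 (0.1839) and F1-score is highest at epoch 5 (0.8483). The widening gap between training and validation loss after that point is a common sign of overfitting.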

Environmental Impact

Carbon emissions were estimated using codecarbon. The reported emissions include the hyperparameter search performed on a subset of the training data.

  • Hardware Type: 16GB T4
  • Hours used: 3
  • Cloud Provider: Google Colab
  • Carbon Emitted: 0.276132
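
For context, a codecarbon measurement typically wraps the training call between `start()` and `stop()` on an `EmissionsTracker`. The sketch below is an assumption about how such tracking could be set up, not the author's script; `run_with_emissions_tracking` and `train_fn` are hypothetical names. codecarbon documents the value returned by `stop()` as kilograms of CO2-equivalent, while the card itself does not state a unit for the figure above.

```python
def run_with_emissions_tracking(train_fn, tracker=None):
    """Run train_fn() under an emissions tracker and return (result, emissions).

    `tracker` may be any object with start() and stop(); by default a
    codecarbon EmissionsTracker is created.
    """
    if tracker is None:
        # Imported lazily; requires `pip install codecarbon`.
        from codecarbon import EmissionsTracker

        tracker = EmissionsTracker()

    tracker.start()
    try:
        result = train_fn()
    finally:
        # Always stop the tracker, even if training raises.
        emissions = tracker.stop()
    return result, emissions
```

To cover a hyperparameter search as well as the final run, as the card describes, the same wrapper could enclose both steps inside a single `train_fn`.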

Framework versions

  • Transformers 4.28.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3

Dataset used to train ppsingh/mpnet-multilabel-sector-classifier