sbert_large_nlu_ru_pos

This model is a fine-tuned version of ai-forever/sbert_large_nlu_ru on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4870
  • Precision: 0.5717
  • Recall: 0.6050
  • F1: 0.5879
  • Accuracy: 0.9001
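
The card does not say how the model is meant to be called. The `_pos` suffix and the token-level precision/recall/F1/accuracy metrics suggest a token-classification (POS-tagging) head, so here is a minimal inference sketch under that assumption; the repository id is the one shown on the model page, and the example sentence is arbitrary:

```python
# Minimal inference sketch. Assumption: the checkpoint exposes a token-classification
# head and loads with AutoModelForTokenClassification; this is not stated in the card.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_id = "DimasikKurd/sbert_large_nlu_ru_pos"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

tagger = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",  # merge sub-word pieces back into whole words
)
print(tagger("Мама мыла раму."))  # list of dicts: word, entity_group, score, start/end
```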

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
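
These values map directly onto transformers.TrainingArguments. The sketch below shows the corresponding configuration, assuming the run used the Trainer API; the output directory is a placeholder, and the 50-step evaluation interval is read off the results table below:

```python
# Sketch of the reported hyperparameters as TrainingArguments.
# output_dir is a placeholder; the card does not name one.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="sbert_large_nlu_ru_pos",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="steps",  # the results table logs an evaluation every 50 steps
    eval_steps=50,
    logging_steps=500,  # training loss appears every 500 steps ("No log" before that)
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the Trainer's default
# optimizer settings, so no explicit optimizer override is needed.
```

Passed to a Trainer together with a tokenized dataset and a token-classification model initialized from ai-forever/sbert_large_nlu_ru, these arguments should reproduce the schedule described above.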

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.09  | 50   | 0.6457          | 0.0       | 0.0    | 0.0    | 0.7571   |
| No log        | 2.17  | 100  | 0.5343          | 0.0458    | 0.0463 | 0.0461 | 0.7998   |
| No log        | 3.26  | 150  | 0.3732          | 0.1121    | 0.1486 | 0.1278 | 0.8512   |
| No log        | 4.35  | 200  | 0.3237          | 0.2713    | 0.3436 | 0.3032 | 0.8778   |
| No log        | 5.43  | 250  | 0.2921          | 0.3412    | 0.4189 | 0.3761 | 0.8935   |
| No log        | 6.52  | 300  | 0.2778          | 0.4079    | 0.5386 | 0.4642 | 0.9011   |
| No log        | 7.61  | 350  | 0.2989          | 0.4301    | 0.4807 | 0.4540 | 0.9012   |
| No log        | 8.70  | 400  | 0.2617          | 0.4489    | 0.5676 | 0.5013 | 0.9083   |
| No log        | 9.78  | 450  | 0.3645          | 0.4661    | 0.5174 | 0.4904 | 0.9050   |
| 0.3288        | 10.87 | 500  | 0.3305          | 0.5297    | 0.6023 | 0.5637 | 0.9126   |
| 0.3288        | 11.96 | 550  | 0.3256          | 0.5544    | 0.6004 | 0.5765 | 0.9093   |
| 0.3288        | 13.04 | 600  | 0.3275          | 0.4330    | 0.5927 | 0.5004 | 0.9093   |
| 0.3288        | 14.13 | 650  | 0.4194          | 0.5017    | 0.5618 | 0.5301 | 0.9123   |
| 0.3288        | 15.22 | 700  | 0.3667          | 0.5275    | 0.6100 | 0.5658 | 0.9138   |
| 0.3288        | 16.30 | 750  | 0.4694          | 0.5117    | 0.6351 | 0.5668 | 0.9087   |
| 0.3288        | 17.39 | 800  | 0.4007          | 0.5381    | 0.6139 | 0.5735 | 0.9098   |
| 0.3288        | 18.48 | 850  | 0.3834          | 0.5264    | 0.5965 | 0.5593 | 0.9103   |
| 0.3288        | 19.57 | 900  | 0.4039          | 0.5061    | 0.6371 | 0.5641 | 0.9078   |
| 0.3288        | 20.65 | 950  | 0.5111          | 0.5850    | 0.6042 | 0.5945 | 0.9107   |
| 0.0507        | 21.74 | 1000 | 0.5454          | 0.5699    | 0.5985 | 0.5838 | 0.9124   |
| 0.0507        | 22.83 | 1050 | 0.4575          | 0.5668    | 0.6139 | 0.5894 | 0.9148   |
| 0.0507        | 23.91 | 1100 | 0.3752          | 0.5281    | 0.6178 | 0.5694 | 0.9126   |
| 0.0507        | 25.00 | 1150 | 0.5141          | 0.6074    | 0.6332 | 0.6200 | 0.9159   |
| 0.0507        | 26.09 | 1200 | 0.4203          | 0.5464    | 0.6371 | 0.5882 | 0.9134   |
| 0.0507        | 27.17 | 1250 | 0.4810          | 0.5150    | 0.6313 | 0.5672 | 0.9115   |
| 0.0507        | 28.26 | 1300 | 0.4972          | 0.5560    | 0.5753 | 0.5655 | 0.9116   |
| 0.0507        | 29.35 | 1350 | 0.6118          | 0.5439    | 0.6216 | 0.5802 | 0.9127   |
| 0.0507        | 30.43 | 1400 | 0.5298          | 0.4354    | 0.6371 | 0.5172 | 0.8847   |
| 0.0507        | 31.52 | 1450 | 0.5129          | 0.5771    | 0.6216 | 0.5985 | 0.9132   |
| 0.0234        | 32.61 | 1500 | 0.5165          | 0.5395    | 0.6332 | 0.5826 | 0.9068   |
| 0.0234        | 33.70 | 1550 | 0.4776          | 0.5110    | 0.6255 | 0.5625 | 0.9095   |
| 0.0234        | 34.78 | 1600 | 0.3794          | 0.5156    | 0.6699 | 0.5827 | 0.9117   |
| 0.0234        | 35.87 | 1650 | 0.4895          | 0.6074    | 0.6332 | 0.6200 | 0.9165   |
| 0.0234        | 36.96 | 1700 | 0.5130          | 0.6317    | 0.6158 | 0.6237 | 0.9137   |
| 0.0234        | 38.04 | 1750 | 0.5138          | 0.6143    | 0.6120 | 0.6132 | 0.9103   |
| 0.0234        | 39.13 | 1800 | 0.5555          | 0.5579    | 0.6602 | 0.6048 | 0.9044   |
| 0.0234        | 40.22 | 1850 | 0.3895          | 0.5055    | 0.6197 | 0.5568 | 0.9107   |
| 0.0234        | 41.30 | 1900 | 0.4607          | 0.5936    | 0.6429 | 0.6172 | 0.9101   |
| 0.0234        | 42.39 | 1950 | 0.3913          | 0.5654    | 0.6429 | 0.6016 | 0.9091   |
| 0.0259        | 43.48 | 2000 | 0.3646          | 0.5797    | 0.6602 | 0.6173 | 0.9091   |
| 0.0259        | 44.57 | 2050 | 0.5094          | 0.6579    | 0.6274 | 0.6423 | 0.9191   |
| 0.0259        | 45.65 | 2100 | 0.4718          | 0.5996    | 0.6158 | 0.6076 | 0.9124   |
| 0.0259        | 46.74 | 2150 | 0.5557          | 0.5855    | 0.6409 | 0.6120 | 0.9056   |
| 0.0259        | 47.83 | 2200 | 0.5481          | 0.6018    | 0.6332 | 0.6171 | 0.9106   |
| 0.0259        | 48.91 | 2250 | 0.5198          | 0.5535    | 0.6486 | 0.5973 | 0.9104   |
| 0.0259        | 50.00 | 2300 | 0.4876          | 0.6282    | 0.6197 | 0.6239 | 0.9098   |
| 0.0259        | 51.09 | 2350 | 0.4904          | 0.5352    | 0.5135 | 0.5241 | 0.8984   |
| 0.0259        | 52.17 | 2400 | 0.4268          | 0.5639    | 0.6390 | 0.5991 | 0.9080   |
| 0.0259        | 53.26 | 2450 | 0.4759          | 0.5695    | 0.5772 | 0.5733 | 0.9057   |
| 0.0221        | 54.35 | 2500 | 0.5927          | 0.6129    | 0.5869 | 0.5996 | 0.9017   |
| 0.0221        | 55.43 | 2550 | 0.4404          | 0.4917    | 0.6274 | 0.5513 | 0.8964   |
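
The card does not state how precision, recall, F1, and accuracy were computed. Token-classification cards produced by the Trainer typically report seqeval-style scores, where precision/recall/F1 are span-level and accuracy is token-level. Below is a hedged sketch of such a compute_metrics function; seqeval and the placeholder tag set are assumptions, not documented here:

```python
# Hypothetical compute_metrics in the style of the HF token-classification examples.
# seqeval and the IOB-style placeholder label_list are assumptions, not from the card.
import numpy as np
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

label_list = ["O", "B-TAG", "I-TAG"]  # placeholder; the real tag set is undocumented

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Drop padded/special positions, which the Trainer marks with label id -100.
    true_tags = [[label_list[l] for l in row if l != -100] for row in labels]
    pred_tags = [
        [label_list[p] for p, l in zip(p_row, l_row) if l != -100]
        for p_row, l_row in zip(predictions, labels)
    ]

    return {
        "precision": precision_score(true_tags, pred_tags),
        "recall": recall_score(true_tags, pred_tags),
        "f1": f1_score(true_tags, pred_tags),
        "accuracy": accuracy_score(true_tags, pred_tags),
    }
```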

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2