distilbert_lda_50_v1_stsb

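This model is a fine-tuned version of gokulsrinivasagan/distilbert_lda_50_v1 on the GLUE STSB dataset. The results below match the epoch 15 checkpoint (step 345), which has the lowest validation loss in the training log. It achieves the following results on the evaluation set: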

  • Loss: 1.3807
  • Pearson: 0.6576
  • Spearmanr: 0.6483
  • Combined Score: 0.6529
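The reported numbers are consistent with the combined score being the arithmetic mean of the Pearson and Spearman correlations. The following is a minimal pure-Python sketch of that relationship, not the evaluation code used for this model; the function names are hypothetical, and the card does not state which library computed the metrics.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def combined_score(x, y):
    """Mean of Pearson and Spearman (assumed definition of Combined Score)."""
    return (pearson(x, y) + spearman(x, y)) / 2


# Consistency check against the values reported above (rounded to 4 places).
reported_mean = (0.6576 + 0.6483) / 2
```

Here `reported_mean` evaluates to roughly 0.65295, which agrees with the reported Combined Score of 0.6529 up to rounding of the underlying unrounded metrics.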

Model description

More information needed

Intended uses & limitations

More information needed
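As a hedged illustration of one possible use, the sketch below scores a sentence pair for semantic similarity. It assumes the repository id is gokulsrinivasagan/distilbert_lda_50_v1_stsb, that the repository ships a tokenizer, and that the checkpoint has a single regression output, as is usual for STS-B fine-tuning; none of this is confirmed by the card. The helper names are hypothetical.

```python
# Assumed repository id, built from the model name and the base model's namespace.
MODEL_ID = "gokulsrinivasagan/distilbert_lda_50_v1_stsb"


def clamp_to_stsb_range(score, low=0.0, high=5.0):
    """Clamp a raw regression output to the 0-5 STS-B similarity scale."""
    return max(low, min(high, score))


def predict_similarity(sentence1, sentence2, model_id=MODEL_ID):
    """Return a similarity score for one sentence pair.

    Imports are local so the module can be loaded without torch or
    transformers installed; calling this function requires both,
    plus network access to download the checkpoint.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Encode the two sentences as a single paired input.
    inputs = tokenizer(sentence1, sentence2, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits

    # Assumes logits has shape (1, 1): one regression value per pair.
    return clamp_to_stsb_range(logits.squeeze().item())
```

A call such as `predict_similarity("A man is playing a guitar.", "A person plays an instrument.")` would return a float between 0 and 5. Given the evaluation correlations of about 0.65, the scores should be treated as rough estimates.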

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 10
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 50 (configured; the training results table records 20 epochs)
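The hyperparameters listed above could be expressed for the Transformers `Trainer` roughly as follows. This is a sketch under assumptions, not the author's training script: the card does not say whether the batch size of 256 is per device or total, so a single device is assumed, and `output_dir` is a placeholder.

```python
import math

# Values copied from the hyperparameter list; keys are TrainingArguments names.
HYPERPARAMS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 256,  # assumes a single device
    "per_device_eval_batch_size": 256,   # assumes a single device
    "seed": 10,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}

# Size of the standard GLUE STS-B training split (assumed to be the data used).
STSB_TRAIN_EXAMPLES = 5749


def steps_per_epoch(num_examples=STSB_TRAIN_EXAMPLES, batch_size=256):
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


def build_training_arguments(output_dir="distilbert_lda_50_v1_stsb"):
    """Build TrainingArguments; requires the transformers package."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

With these assumptions `steps_per_epoch()` gives 23, which matches the step counts in the training results table (23, 46, 69, ...).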

Training results

| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
|---------------|-------|------|-----------------|---------|-----------|----------------|
| 2.6455        | 1.0   | 23   | 2.4817          | 0.0933  | 0.0791    | 0.0862         |
| 2.0015        | 2.0   | 46   | 2.5968          | 0.1340  | 0.1191    | 0.1265         |
| 1.875         | 3.0   | 69   | 2.2990          | 0.2213  | 0.2184    | 0.2198         |
| 1.4576        | 4.0   | 92   | 2.3398          | 0.4446  | 0.4147    | 0.4296         |
| 1.0618        | 5.0   | 115  | 1.8954          | 0.5425  | 0.5250    | 0.5338         |
| 0.7001        | 6.0   | 138  | 1.9422          | 0.5693  | 0.5653    | 0.5673         |
| 0.5415        | 7.0   | 161  | 2.0586          | 0.5778  | 0.5692    | 0.5735         |
| 0.4005        | 8.0   | 184  | 1.5390          | 0.6166  | 0.6056    | 0.6111         |
| 0.3147        | 9.0   | 207  | 1.7006          | 0.5953  | 0.5793    | 0.5873         |
| 0.2829        | 10.0  | 230  | 1.4072          | 0.6480  | 0.6375    | 0.6428         |
| 0.2308        | 11.0  | 253  | 1.5357          | 0.6279  | 0.6134    | 0.6207         |
| 0.1996        | 12.0  | 276  | 1.6600          | 0.6341  | 0.6212    | 0.6276         |
| 0.1822        | 13.0  | 299  | 1.5625          | 0.6463  | 0.6354    | 0.6409         |
| 0.1646        | 14.0  | 322  | 1.3891          | 0.6635  | 0.6566    | 0.6601         |
| 0.1492        | 15.0  | 345  | 1.3807          | 0.6576  | 0.6483    | 0.6529         |
| 0.1296        | 16.0  | 368  | 1.4665          | 0.6426  | 0.6319    | 0.6372         |
| 0.1306        | 17.0  | 391  | 1.5261          | 0.6364  | 0.6213    | 0.6288         |
| 0.1205        | 18.0  | 414  | 1.4115          | 0.6427  | 0.6266    | 0.6347         |
| 0.1087        | 19.0  | 437  | 1.5183          | 0.6400  | 0.6248    | 0.6324         |
| 0.1072        | 20.0  | 460  | 1.5816          | 0.6414  | 0.6310    | 0.6362         |
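The training log can be used to compare checkpoint-selection criteria. The sketch below copies the epoch, validation loss, and combined score columns from the results above and finds the best epoch under each criterion. The variable and function names are hypothetical, and the card does not state which criterion was actually used.

```python
# (epoch, validation_loss, combined_score), copied from the training results.
LOG = [
    (1, 2.4817, 0.0862), (2, 2.5968, 0.1265), (3, 2.2990, 0.2198),
    (4, 2.3398, 0.4296), (5, 1.8954, 0.5338), (6, 1.9422, 0.5673),
    (7, 2.0586, 0.5735), (8, 1.5390, 0.6111), (9, 1.7006, 0.5873),
    (10, 1.4072, 0.6428), (11, 1.5357, 0.6207), (12, 1.6600, 0.6276),
    (13, 1.5625, 0.6409), (14, 1.3891, 0.6601), (15, 1.3807, 0.6529),
    (16, 1.4665, 0.6372), (17, 1.5261, 0.6288), (18, 1.4115, 0.6347),
    (19, 1.5183, 0.6324), (20, 1.5816, 0.6362),
]


def best_by_validation_loss(log):
    """Epoch with the lowest validation loss."""
    return min(log, key=lambda row: row[1])[0]


def best_by_combined_score(log):
    """Epoch with the highest combined score."""
    return max(log, key=lambda row: row[2])[0]


print("lowest validation loss: epoch", best_by_validation_loss(LOG))
print("highest combined score: epoch", best_by_combined_score(LOG))
```

The two criteria disagree by one epoch: validation loss is lowest at epoch 15, while the combined score peaks at epoch 14 (0.6601 versus 0.6529). The headline results reported for this model match the epoch 15 row.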

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.2.1+cu118
  • Datasets 2.17.0
  • Tokenizers 0.20.3