
roberta-base-nsp-2000-1e-06-8

This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3459

Model description

More information needed

Intended uses & limitations

More information needed
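The card does not document the task. The "nsp" in the model name suggests a next-sentence-prediction-style sentence-pair classifier, so the following is a minimal loading sketch under that assumption; the hub id, the sequence-classification head, and the label meanings are all unverified.

```python
# Hypothetical usage sketch: the task is undocumented on this card. The "nsp"
# in the model name suggests sentence-pair (next-sentence-prediction-style)
# classification, so a sequence-classification head is assumed here.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "roberta-base-nsp-2000-1e-06-8"  # replace with the full hub id (namespace/model)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Encode a candidate sentence pair and score it.
inputs = tokenizer("The first sentence.", "A possible next sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class meanings are not documented on this card
```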

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reproduction sketch follows the list):

  • learning_rate: 1e-06
  • train_batch_size: 32
  • eval_batch_size: 1024
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
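As a rough guide, here is how these settings map onto `TrainingArguments` in Transformers 4.40. The dataset is not documented, so `train_ds` and `eval_ds` below are placeholders for an already tokenized sentence-pair dataset, and the two-label head is an assumption; the listed Adam betas and epsilon are the library defaults and need no explicit arguments.

```python
# Hedged reproduction sketch of the run described above; not the author's script.
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Assumption: a binary sentence-pair classification head on top of roberta-base.
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

args = TrainingArguments(
    output_dir="roberta-base-nsp-2000-1e-06-8",
    learning_rate=1e-6,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=1024,
    num_train_epochs=40,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",  # assumption: matches the per-epoch rows in the results table
    logging_strategy="epoch",
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the defaults, so they are omitted.
)

# train_ds / eval_ds are placeholders for the undocumented dataset.
trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```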

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 63 | 0.6949 |
| No log | 2.0 | 126 | 0.6931 |
| No log | 3.0 | 189 | 0.6922 |
| 0.6959 | 4.0 | 252 | 0.6914 |
| 0.6959 | 5.0 | 315 | 0.6903 |
| 0.6959 | 6.0 | 378 | 0.6882 |
| 0.6919 | 7.0 | 441 | 0.6848 |
| 0.6919 | 8.0 | 504 | 0.6793 |
| 0.6919 | 9.0 | 567 | 0.6689 |
| 0.6783 | 10.0 | 630 | 0.6529 |
| 0.6783 | 11.0 | 693 | 0.6351 |
| 0.6783 | 12.0 | 756 | 0.6112 |
| 0.6353 | 13.0 | 819 | 0.5895 |
| 0.6353 | 14.0 | 882 | 0.5650 |
| 0.6353 | 15.0 | 945 | 0.5421 |
| 0.5772 | 16.0 | 1008 | 0.5122 |
| 0.5772 | 17.0 | 1071 | 0.4856 |
| 0.5772 | 18.0 | 1134 | 0.4610 |
| 0.5772 | 19.0 | 1197 | 0.4414 |
| 0.4996 | 20.0 | 1260 | 0.4220 |
| 0.4996 | 21.0 | 1323 | 0.4072 |
| 0.4996 | 22.0 | 1386 | 0.3954 |
| 0.4297 | 23.0 | 1449 | 0.3844 |
| 0.4297 | 24.0 | 1512 | 0.3740 |
| 0.4297 | 25.0 | 1575 | 0.3714 |
| 0.3637 | 26.0 | 1638 | 0.3714 |
| 0.3637 | 27.0 | 1701 | 0.3612 |
| 0.3637 | 28.0 | 1764 | 0.3551 |
| 0.3247 | 29.0 | 1827 | 0.3522 |
| 0.3247 | 30.0 | 1890 | 0.3506 |
| 0.3247 | 31.0 | 1953 | 0.3482 |
| 0.2962 | 32.0 | 2016 | 0.3475 |
| 0.2962 | 33.0 | 2079 | 0.3454 |
| 0.2962 | 34.0 | 2142 | 0.3461 |
| 0.278 | 35.0 | 2205 | 0.3445 |
| 0.278 | 36.0 | 2268 | 0.3448 |
| 0.278 | 37.0 | 2331 | 0.3464 |
| 0.278 | 38.0 | 2394 | 0.3459 |
| 0.2662 | 39.0 | 2457 | 0.3462 |
| 0.2662 | 40.0 | 2520 | 0.3459 |

Framework versions

  • Transformers 4.40.2
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
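
To check that a local environment matches the versions above, a quick sanity check (hypothetical, not part of the original card):

```python
# Print installed versions and compare against the list above.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expected: 4.40.2
print(torch.__version__)         # expected: 2.3.0+cu121
print(datasets.__version__)      # expected: 2.19.1
print(tokenizers.__version__)    # expected: 0.19.1
```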