---
license: apache-2.0
base_model: distilroberta-base
tags:
  - generated_from_trainer
model-index:
  - name: distilroberta-rbm231k-ep20-op40
    results: []
---

# distilroberta-rbm231k-ep20-op40

This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.1240
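
Since the base model is a masked-language model, the checkpoint can be exercised with a `fill-mask` pipeline. A minimal sketch, assuming the repo id `judy93536/distilroberta-rbm231k-ep20-op40` (inferred from the card title and uploader):

```python
from transformers import pipeline

# Assumed repo id; adjust if the checkpoint lives under a different namespace.
fill_mask = pipeline("fill-mask", model="judy93536/distilroberta-rbm231k-ep20-op40")

# DistilRoBERTa uses <mask> as its mask token.
print(fill_mask("The quick brown fox <mask> over the lazy dog."))
```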

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):

- learning_rate: 7.3e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.19
- num_epochs: 40
- mixed_precision_training: Native AMP
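
These settings map directly onto `transformers.TrainingArguments`. A minimal sketch: the listed Adam betas and epsilon are the Transformers defaults, so the default optimizer needs no extra configuration, and per-epoch evaluation is an assumption made to match the results table below:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilroberta-rbm231k-ep20-op40",
    learning_rate=7.3e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.19,
    num_train_epochs=40,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # assumption: one eval per epoch, as in the results table
)
```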

### Training results

| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 1.696         | 1.0   | 14644  | 1.5458          |
| 1.5919        | 2.0   | 29288  | 1.4527          |
| 1.5345        | 3.0   | 43932  | 1.4052          |
| 1.4921        | 4.0   | 58576  | 1.3763          |
| 1.4798        | 5.0   | 73220  | 1.3612          |
| 1.464         | 6.0   | 87864  | 1.3492          |
| 1.4498        | 7.0   | 102508 | 1.3409          |
| 1.448         | 8.0   | 117152 | 1.3355          |
| 1.4262        | 9.0   | 131796 | 1.3213          |
| 1.4175        | 10.0  | 146440 | 1.3096          |
| 1.3851        | 11.0  | 161084 | 1.2963          |
| 1.3728        | 12.0  | 175728 | 1.2846          |
| 1.3601        | 13.0  | 190372 | 1.2739          |
| 1.351         | 14.0  | 205016 | 1.2639          |
| 1.3406        | 15.0  | 219660 | 1.2555          |
| 1.3214        | 16.0  | 234304 | 1.2454          |
| 1.31          | 17.0  | 248948 | 1.2372          |
| 1.3117        | 18.0  | 263592 | 1.2317          |
| 1.2947        | 19.0  | 278236 | 1.2267          |
| 1.2858        | 20.0  | 292880 | 1.2162          |
| 1.2804        | 21.0  | 307524 | 1.2092          |
| 1.2708        | 22.0  | 322168 | 1.2064          |
| 1.2635        | 23.0  | 336812 | 1.1974          |
| 1.253         | 24.0  | 351456 | 1.1926          |
| 1.2463        | 25.0  | 366100 | 1.1832          |
| 1.2399        | 26.0  | 380744 | 1.1817          |
| 1.2328        | 27.0  | 395388 | 1.1752          |
| 1.2292        | 28.0  | 410032 | 1.1710          |
| 1.2197        | 29.0  | 424676 | 1.1672          |
| 1.2101        | 30.0  | 439320 | 1.1618          |
| 1.2029        | 31.0  | 453964 | 1.1551          |
| 1.2007        | 32.0  | 468608 | 1.1515          |
| 1.1932        | 33.0  | 483252 | 1.1438          |
| 1.1884        | 34.0  | 497896 | 1.1406          |
| 1.1835        | 35.0  | 512540 | 1.1364          |
| 1.1762        | 36.0  | 527184 | 1.1344          |
| 1.174         | 37.0  | 541828 | 1.1315          |
| 1.1675        | 38.0  | 556472 | 1.1267          |
| 1.1699        | 39.0  | 571116 | 1.1230          |
| 1.1629        | 40.0  | 585760 | 1.1274          |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0
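
When reproducing the training environment, the pinned versions above can be checked at runtime. A small sketch:

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported on this card: Transformers 4.35.2, PyTorch 2.1.0+cu118,
# Datasets 2.15.0, Tokenizers 0.15.0.
for name, module in [
    ("transformers", transformers),
    ("torch", torch),
    ("datasets", datasets),
    ("tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```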