
roberta-base-pp-1000-1e-06-8

This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6920
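For context, this loss sits just below ln 2 ≈ 0.6931, the cross-entropy of a uniform guess over two classes. Assuming the classification head is binary (the card does not say), the checkpoint is only marginally better than chance. A quick check:

```python
import math

# Cross-entropy of a uniform prediction over two classes (binary chance level)
chance_loss = math.log(2)
print(round(chance_loss, 4))  # 0.6931

# Final validation loss reported above
final_eval_loss = 0.6920
print(final_eval_loss < chance_loss)  # True: just below the chance level
```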

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 32
  • eval_batch_size: 1024
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
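As a sanity check, the step counts in the training-results table follow from these hyperparameters. Assuming the "1000" in the model name is the number of training examples (an inference from the name, not stated in the card), a short calculation reproduces the logged 32 steps per epoch and 1280 total steps:

```python
import math

# Assumption: the "1000" in the model name is the training-set size;
# this is consistent with the logged step counts.
num_examples = 1000
train_batch_size = 32
num_epochs = 40

steps_per_epoch = math.ceil(num_examples / train_batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch)  # 32, matching the per-epoch step increment in the table
print(total_steps)      # 1280, matching the final logged step
```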

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 32   | 0.6944          |
| No log        | 2.0   | 64   | 0.6938          |
| No log        | 3.0   | 96   | 0.6934          |
| No log        | 4.0   | 128  | 0.6932          |
| No log        | 5.0   | 160  | 0.6931          |
| No log        | 6.0   | 192  | 0.6930          |
| 0.6946        | 7.0   | 224  | 0.6930          |
| 0.6946        | 8.0   | 256  | 0.6929          |
| 0.6946        | 9.0   | 288  | 0.6929          |
| 0.6946        | 10.0  | 320  | 0.6929          |
| 0.6946        | 11.0  | 352  | 0.6928          |
| 0.6946        | 12.0  | 384  | 0.6928          |
| 0.6929        | 13.0  | 416  | 0.6928          |
| 0.6929        | 14.0  | 448  | 0.6927          |
| 0.6929        | 15.0  | 480  | 0.6927          |
| 0.6929        | 16.0  | 512  | 0.6926          |
| 0.6929        | 17.0  | 544  | 0.6926          |
| 0.6929        | 18.0  | 576  | 0.6926          |
| 0.6928        | 19.0  | 608  | 0.6926          |
| 0.6928        | 20.0  | 640  | 0.6925          |
| 0.6928        | 21.0  | 672  | 0.6924          |
| 0.6928        | 22.0  | 704  | 0.6924          |
| 0.6928        | 23.0  | 736  | 0.6923          |
| 0.6928        | 24.0  | 768  | 0.6923          |
| 0.6906        | 25.0  | 800  | 0.6923          |
| 0.6906        | 26.0  | 832  | 0.6922          |
| 0.6906        | 27.0  | 864  | 0.6922          |
| 0.6906        | 28.0  | 896  | 0.6921          |
| 0.6906        | 29.0  | 928  | 0.6921          |
| 0.6906        | 30.0  | 960  | 0.6921          |
| 0.6906        | 31.0  | 992  | 0.6921          |
| 0.6897        | 32.0  | 1024 | 0.6920          |
| 0.6897        | 33.0  | 1056 | 0.6920          |
| 0.6897        | 34.0  | 1088 | 0.6920          |
| 0.6897        | 35.0  | 1120 | 0.6920          |
| 0.6897        | 36.0  | 1152 | 0.6920          |
| 0.6897        | 37.0  | 1184 | 0.6920          |
| 0.69          | 38.0  | 1216 | 0.6920          |
| 0.69          | 39.0  | 1248 | 0.6920          |
| 0.69          | 40.0  | 1280 | 0.6920          |

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
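To reproduce this environment, the versions above can be pinned at install time. This is a sketch, not a tested environment spec; the `+cu121` PyTorch build is served from the PyTorch CUDA 12.1 wheel index rather than PyPI:

```shell
# Pin the framework versions listed above
pip install "transformers==4.40.2" "datasets==2.19.1" "tokenizers==0.19.1"
# The CUDA 12.1 build of PyTorch comes from the PyTorch wheel index
pip install "torch==2.3.0" --index-url https://download.pytorch.org/whl/cu121
```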
Model details

  • Model size: 125M params
  • Tensor type: F32 (Safetensors)