Latest commit: sayakpaul (HF staff), "Training in progress epoch 29", 4184a67
metadata
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: tf-tpu/roberta-base-epochs-500-no-wd
    results: []

tf-tpu/roberta-base-epochs-500-no-wd

This model is a fine-tuned version of roberta-base on an unknown dataset. At epoch 29, the most recent logged epoch, it achieves the following results on the training and evaluation sets:

  • Train Loss: 2.0789
  • Train Accuracy: 0.0907
  • Validation Loss: 1.8604
  • Validation Accuracy: 0.0958
  • Epoch: 29
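The card does not say which loss function was used. As a rough aid to reading the numbers above, the sketch below converts the reported losses to perplexity, assuming (this is an assumption, not something the card states) that each loss is a mean token-level cross-entropy measured in nats. The helper name `perplexity` is illustrative only.

```python
import math

# Values copied from the epoch-29 results listed above.
TRAIN_LOSS = 2.0789
VALIDATION_LOSS = 1.8604


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean cross-entropy loss to perplexity.

    Assumption: the loss is an average cross-entropy in nats.
    If the loss is defined differently, this number is not meaningful.
    """
    return math.exp(mean_cross_entropy_nats)


if __name__ == "__main__":
    print(f"train perplexity:      {perplexity(TRAIN_LOSS):.2f}")
    print(f"validation perplexity: {perplexity(VALIDATION_LOSS):.2f}")
```

If the assumption holds, lower perplexity means the model spreads its probability mass over fewer candidate tokens per prediction.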

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 278825, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 14675, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
  • training_precision: mixed_bfloat16
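The optimizer configuration above nests a `PolynomialDecay` schedule inside a `WarmUp` wrapper. The following is a minimal pure-Python sketch of the learning-rate curve that configuration appears to describe: a linear ramp over the first 14675 steps, then a linear decay to zero over the next 278825 steps. It assumes the decay schedule is evaluated at `step - warmup_steps`, which is how such wrappers are commonly implemented; it is not the training code itself, and `learning_rate` is a hypothetical helper name.

```python
# Constants copied from the optimizer config above.
INITIAL_LR = 1e-4
WARMUP_STEPS = 14675
DECAY_STEPS = 278825
END_LR = 0.0
POWER = 1.0  # power 1.0 makes both the warmup and the decay linear


def learning_rate(step: int) -> float:
    """Approximate learning rate at a given optimizer step.

    Assumption: after warmup, the decay schedule sees `step - WARMUP_STEPS`.
    """
    if step < WARMUP_STEPS:
        # Warmup phase: ramp from 0 up to INITIAL_LR.
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    # Decay phase: polynomial decay without cycling, clamped at DECAY_STEPS.
    decay_step = min(step - WARMUP_STEPS, DECAY_STEPS)
    remaining = 1.0 - decay_step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, WARMUP_STEPS + DECAY_STEPS):
        print(step, learning_rate(step))
```

Under these assumptions the schedule peaks at 0.0001 at step 14675 and reaches zero at step 293500 (14675 + 278825).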

Training results

Train Loss  Train Accuracy  Validation Loss  Validation Accuracy  Epoch
8.3284      0.0211          7.1523           0.0266               0
6.3670      0.0318          5.7812           0.0342               1
5.6051      0.0380          5.4414           0.0420               2
5.3602      0.0433          5.2734           0.0432               3
5.2285      0.0444          5.1562           0.0442               4
5.1371      0.0446          5.1133           0.0436               5
5.0673      0.0446          5.0703           0.0442               6
5.0132      0.0447          4.9883           0.0442               7
4.9642      0.0448          4.9219           0.0441               8
4.9217      0.0448          4.9258           0.0440               9
4.8871      0.0448          4.8867           0.0439               10
4.8548      0.0449          4.8672           0.0439               11
4.8277      0.0449          4.8047           0.0445               12
4.8033      0.0449          4.8477           0.0437               13
4.7807      0.0449          4.7617           0.0439               14
4.7592      0.0449          4.7773           0.0437               15
4.7388      0.0449          4.7539           0.0441               16
4.7225      0.0449          4.7266           0.0439               17
4.7052      0.0449          4.6914           0.0450               18
4.6917      0.0449          4.7188           0.0444               19
4.6789      0.0449          4.6914           0.0444               20
4.6689      0.0449          4.7031           0.0439               21
4.6570      0.0449          4.7031           0.0437               22
4.6486      0.0450          4.6758           0.0446               23
4.6393      0.0449          4.6914           0.0441               24
4.5898      0.0449          4.4688           0.0452               25
4.3024      0.0472          3.8730           0.0551               26
3.1689      0.0693          2.4375           0.0835               27
2.3780      0.0844          2.0498           0.0922               28
2.0789      0.0907          1.8604           0.0958               29
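The results above show validation loss hovering near 4.7 through epoch 24 and then falling quickly over epochs 25 to 29. A small sketch for locating the steepest epoch-to-epoch drop is given below; it uses only the last six rows, copied from the results above, and the helper name `validation_loss_drops` is illustrative rather than part of any library.

```python
# Last six rows of the training results:
# (train_loss, train_accuracy, validation_loss, validation_accuracy, epoch)
ROWS = [
    (4.6393, 0.0449, 4.6914, 0.0441, 24),
    (4.5898, 0.0449, 4.4688, 0.0452, 25),
    (4.3024, 0.0472, 3.8730, 0.0551, 26),
    (3.1689, 0.0693, 2.4375, 0.0835, 27),
    (2.3780, 0.0844, 2.0498, 0.0922, 28),
    (2.0789, 0.0907, 1.8604, 0.0958, 29),
]


def validation_loss_drops(rows):
    """Return (epoch, drop) pairs, where drop = previous val loss - current val loss."""
    return [(cur[4], prev[2] - cur[2]) for prev, cur in zip(rows, rows[1:])]


drops = validation_loss_drops(ROWS)
# Pick the epoch whose validation loss fell the most relative to the previous epoch.
steepest_epoch, steepest_drop = max(drops, key=lambda pair: pair[1])

if __name__ == "__main__":
    for epoch, drop in drops:
        print(f"epoch {epoch}: validation loss fell by {drop:.4f}")
    print(f"steepest drop at epoch {steepest_epoch}")
```

The same function works on the full set of rows if they are transcribed in epoch order.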

Framework versions

  • Transformers 4.27.0.dev0
  • TensorFlow 2.9.1
  • Tokenizers 0.13.2