---
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: tf-tpu/roberta-base-epochs-500-no-wd
    results: []
---

# tf-tpu/roberta-base-epochs-500-no-wd

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Train Loss: 1.0984
- Train Accuracy: 0.1121
- Validation Loss: 1.0366
- Validation Accuracy: 0.1139
- Epoch: 67
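
As a usage sketch: loading the checkpoint for masked-token prediction. This is a minimal, hedged example; it assumes the repo id from the metadata above resolves on the Hub and that the saved weights are the TensorFlow masked-LM head (the usual output of this Keras-callback setup). Note the checkpoint is from epoch 67 of a 500-epoch run, so it is a training-in-progress snapshot.

```python
# Minimal sketch, assuming a TF masked-LM checkpoint under the model-index name above.
from transformers import AutoTokenizer, TFAutoModelForMaskedLM

repo_id = "tf-tpu/roberta-base-epochs-500-no-wd"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = TFAutoModelForMaskedLM.from_pretrained(repo_id)

# RoBERTa uses <mask> as its mask token.
inputs = tokenizer("The goal of life is <mask>.", return_tensors="tf")
logits = model(**inputs).logits  # per-token vocab logits; argmax at the mask position fills it
```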

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of rebuilding the optimizer follows this list):

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 278825, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 14675, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
- training_precision: mixed_bfloat16
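
The serialized config above is AdamWeightDecay with a linear warmup over 14675 steps into a polynomial (power 1.0, i.e. linear) decay over 278825 steps, ending at 0.0. A minimal sketch of recreating it with the `create_optimizer` helper from `transformers` follows; `num_train_steps = 293500` is an assumption derived as warmup_steps + decay_steps, since `create_optimizer` passes the post-warmup step count to `PolynomialDecay`.

```python
# Hedged sketch: rebuilds the optimizer/schedule described above, not the original training script.
import tensorflow as tf
from transformers import create_optimizer

# training_precision: mixed_bfloat16
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

optimizer, lr_schedule = create_optimizer(
    init_lr=1e-4,             # initial_learning_rate
    num_train_steps=293_500,  # assumption: 14675 warmup + 278825 decay steps
    num_warmup_steps=14_675,  # warmup_steps
    weight_decay_rate=0.001,  # weight_decay_rate; beta_1/beta_2/epsilon defaults match the config
    power=1.0,                # PolynomialDecay power; end_learning_rate defaults to 0.0
)
```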

### Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 8.3284     | 0.0211         | 7.1523          | 0.0266              | 0     |
| 6.3670     | 0.0318         | 5.7812          | 0.0342              | 1     |
| 5.6051     | 0.0380         | 5.4414          | 0.0420              | 2     |
| 5.3602     | 0.0433         | 5.2734          | 0.0432              | 3     |
| 5.2285     | 0.0444         | 5.1562          | 0.0442              | 4     |
| 5.1371     | 0.0446         | 5.1133          | 0.0436              | 5     |
| 5.0673     | 0.0446         | 5.0703          | 0.0442              | 6     |
| 5.0132     | 0.0447         | 4.9883          | 0.0442              | 7     |
| 4.9642     | 0.0448         | 4.9219          | 0.0441              | 8     |
| 4.9217     | 0.0448         | 4.9258          | 0.0440              | 9     |
| 4.8871     | 0.0448         | 4.8867          | 0.0439              | 10    |
| 4.8548     | 0.0449         | 4.8672          | 0.0439              | 11    |
| 4.8277     | 0.0449         | 4.8047          | 0.0445              | 12    |
| 4.8033     | 0.0449         | 4.8477          | 0.0437              | 13    |
| 4.7807     | 0.0449         | 4.7617          | 0.0439              | 14    |
| 4.7592     | 0.0449         | 4.7773          | 0.0437              | 15    |
| 4.7388     | 0.0449         | 4.7539          | 0.0441              | 16    |
| 4.7225     | 0.0449         | 4.7266          | 0.0439              | 17    |
| 4.7052     | 0.0449         | 4.6914          | 0.0450              | 18    |
| 4.6917     | 0.0449         | 4.7188          | 0.0444              | 19    |
| 4.6789     | 0.0449         | 4.6914          | 0.0444              | 20    |
| 4.6689     | 0.0449         | 4.7031          | 0.0439              | 21    |
| 4.6570     | 0.0449         | 4.7031          | 0.0437              | 22    |
| 4.6486     | 0.0450         | 4.6758          | 0.0446              | 23    |
| 4.6393     | 0.0449         | 4.6914          | 0.0441              | 24    |
| 4.5898     | 0.0449         | 4.4688          | 0.0452              | 25    |
| 4.3024     | 0.0472         | 3.8730          | 0.0551              | 26    |
| 3.1689     | 0.0693         | 2.4375          | 0.0835              | 27    |
| 2.3780     | 0.0844         | 2.0498          | 0.0922              | 28    |
| 2.0789     | 0.0907         | 1.8604          | 0.0958              | 29    |
| 1.9204     | 0.0940         | 1.7549          | 0.0982              | 30    |
| 1.8162     | 0.0961         | 1.6836          | 0.0983              | 31    |
| 1.7370     | 0.0978         | 1.5869          | 0.1014              | 32    |
| 1.6723     | 0.0991         | 1.5381          | 0.1029              | 33    |
| 1.6215     | 0.1002         | 1.5283          | 0.1015              | 34    |
| 1.5753     | 0.1012         | 1.4736          | 0.1037              | 35    |
| 1.5295     | 0.1022         | 1.4238          | 0.1052              | 36    |
| 1.4944     | 0.1030         | 1.4141          | 0.1059              | 37    |
| 1.4631     | 0.1037         | 1.3721          | 0.1053              | 38    |
| 1.4363     | 0.1043         | 1.3467          | 0.1060              | 39    |
| 1.4098     | 0.1049         | 1.3213          | 0.1076              | 40    |
| 1.3867     | 0.1054         | 1.3018          | 0.1071              | 41    |
| 1.3658     | 0.1058         | 1.2832          | 0.1083              | 42    |
| 1.3469     | 0.1063         | 1.2637          | 0.1081              | 43    |
| 1.3288     | 0.1067         | 1.2598          | 0.1082              | 44    |
| 1.3111     | 0.1071         | 1.2334          | 0.1096              | 45    |
| 1.2962     | 0.1075         | 1.2490          | 0.1084              | 46    |
| 1.2816     | 0.1078         | 1.2168          | 0.1093              | 47    |
| 1.2672     | 0.1081         | 1.2070          | 0.1090              | 48    |
| 1.2537     | 0.1084         | 1.1680          | 0.1106              | 49    |
| 1.2411     | 0.1087         | 1.1904          | 0.1094              | 50    |
| 1.2285     | 0.1090         | 1.1709          | 0.1103              | 51    |
| 1.2180     | 0.1093         | 1.1602          | 0.1122              | 52    |
| 1.2075     | 0.1095         | 1.1396          | 0.1117              | 53    |
| 1.1973     | 0.1098         | 1.1191          | 0.1124              | 54    |
| 1.1876     | 0.1100         | 1.1260          | 0.1123              | 55    |
| 1.1782     | 0.1102         | 1.1289          | 0.1111              | 56    |
| 1.1698     | 0.1104         | 1.1211          | 0.1117              | 57    |
| 1.1596     | 0.1106         | 1.0977          | 0.1125              | 58    |
| 1.1530     | 0.1108         | 1.1172          | 0.1118              | 59    |
| 1.1462     | 0.1110         | 1.0703          | 0.1126              | 60    |
| 1.1370     | 0.1112         | 1.0830          | 0.1140              | 61    |
| 1.1309     | 0.1113         | 1.0762          | 0.1119              | 62    |
| 1.1234     | 0.1115         | 1.0625          | 0.1137              | 63    |
| 1.1162     | 0.1117         | 1.0781          | 0.1127              | 64    |
| 1.1114     | 0.1118         | 1.0474          | 0.1138              | 65    |
| 1.1036     | 0.1120         | 1.0703          | 0.1134              | 66    |
| 1.0984     | 0.1121         | 1.0366          | 0.1139              | 67    |

### Framework versions

- Transformers 4.27.0.dev0
- TensorFlow 2.9.1
- Tokenizers 0.13.2