sayakpaul HF staff
Training in progress epoch 98
e4f29cc
metadata
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: tf-tpu/roberta-base-epochs-500-no-wd
    results: []

tf-tpu/roberta-base-epochs-500-no-wd

This model is a fine-tuned version of roberta-base on an unknown dataset. At the most recent logged epoch, it achieves the following results on the training and evaluation sets:

  • Train Loss: 0.9740
  • Train Accuracy: 0.1152
  • Validation Loss: 0.9512
  • Validation Accuracy: 0.1169
  • Epoch: 98

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 278825, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 14675, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
  • training_precision: mixed_bfloat16
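The learning-rate entry in the optimizer configuration above nests a polynomial decay inside a warmup wrapper, which is hard to read from the serialized dict. The following is a minimal pure-Python sketch of that schedule, assuming the warmup scales the peak rate by `(step / warmup_steps) ** power` and then hands `step - warmup_steps` to the decay function, and that the decay clamps at `decay_steps` because `cycle` is `False`. The function name `learning_rate` and its parameter names are illustrative, not part of the training script.

```python
def learning_rate(step, peak_lr=1e-4, warmup_steps=14675,
                  decay_steps=278825, end_lr=0.0, power=1.0):
    """Approximate the warmup + polynomial-decay schedule from the config.

    Defaults are copied from the hyperparameters listed in this card;
    the function itself is a hypothetical re-implementation.
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr (linear when power == 1.0).
        return peak_lr * (step / warmup_steps) ** power
    # Decay phase: the step counter restarts at 0 once warmup ends and is
    # clamped at decay_steps (cycle=False), so the rate stays at end_lr after.
    decay_step = min(step - warmup_steps, decay_steps)
    return (peak_lr - end_lr) * (1 - decay_step / decay_steps) ** power + end_lr


if __name__ == "__main__":
    for s in (0, 7000, 14675, 150000, 293500):
        print(f"step {s:>6}: lr = {learning_rate(s):.3e}")
```

With these values the schedule covers 14,675 + 278,825 = 293,500 optimizer steps in total. If the "500" in the model name is the planned epoch count (an assumption, not stated in the card), that would correspond to 587 steps per epoch and a warmup lasting about 25 epochs.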

Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|---|---|---|---|---|
| 8.3284 | 0.0211 | 7.1523 | 0.0266 | 0 |
| 6.3670 | 0.0318 | 5.7812 | 0.0342 | 1 |
| 5.6051 | 0.0380 | 5.4414 | 0.0420 | 2 |
| 5.3602 | 0.0433 | 5.2734 | 0.0432 | 3 |
| 5.2285 | 0.0444 | 5.1562 | 0.0442 | 4 |
| 5.1371 | 0.0446 | 5.1133 | 0.0436 | 5 |
| 5.0673 | 0.0446 | 5.0703 | 0.0442 | 6 |
| 5.0132 | 0.0447 | 4.9883 | 0.0442 | 7 |
| 4.9642 | 0.0448 | 4.9219 | 0.0441 | 8 |
| 4.9217 | 0.0448 | 4.9258 | 0.0440 | 9 |
| 4.8871 | 0.0448 | 4.8867 | 0.0439 | 10 |
| 4.8548 | 0.0449 | 4.8672 | 0.0439 | 11 |
| 4.8277 | 0.0449 | 4.8047 | 0.0445 | 12 |
| 4.8033 | 0.0449 | 4.8477 | 0.0437 | 13 |
| 4.7807 | 0.0449 | 4.7617 | 0.0439 | 14 |
| 4.7592 | 0.0449 | 4.7773 | 0.0437 | 15 |
| 4.7388 | 0.0449 | 4.7539 | 0.0441 | 16 |
| 4.7225 | 0.0449 | 4.7266 | 0.0439 | 17 |
| 4.7052 | 0.0449 | 4.6914 | 0.0450 | 18 |
| 4.6917 | 0.0449 | 4.7188 | 0.0444 | 19 |
| 4.6789 | 0.0449 | 4.6914 | 0.0444 | 20 |
| 4.6689 | 0.0449 | 4.7031 | 0.0439 | 21 |
| 4.6570 | 0.0449 | 4.7031 | 0.0437 | 22 |
| 4.6486 | 0.0450 | 4.6758 | 0.0446 | 23 |
| 4.6393 | 0.0449 | 4.6914 | 0.0441 | 24 |
| 4.5898 | 0.0449 | 4.4688 | 0.0452 | 25 |
| 4.3024 | 0.0472 | 3.8730 | 0.0551 | 26 |
| 3.1689 | 0.0693 | 2.4375 | 0.0835 | 27 |
| 2.3780 | 0.0844 | 2.0498 | 0.0922 | 28 |
| 2.0789 | 0.0907 | 1.8604 | 0.0958 | 29 |
| 1.9204 | 0.0940 | 1.7549 | 0.0982 | 30 |
| 1.8162 | 0.0961 | 1.6836 | 0.0983 | 31 |
| 1.7370 | 0.0978 | 1.5869 | 0.1014 | 32 |
| 1.6723 | 0.0991 | 1.5381 | 0.1029 | 33 |
| 1.6215 | 0.1002 | 1.5283 | 0.1015 | 34 |
| 1.5753 | 0.1012 | 1.4736 | 0.1037 | 35 |
| 1.5295 | 0.1022 | 1.4238 | 0.1052 | 36 |
| 1.4944 | 0.1030 | 1.4141 | 0.1059 | 37 |
| 1.4631 | 0.1037 | 1.3721 | 0.1053 | 38 |
| 1.4363 | 0.1043 | 1.3467 | 0.1060 | 39 |
| 1.4098 | 0.1049 | 1.3213 | 0.1076 | 40 |
| 1.3867 | 0.1054 | 1.3018 | 0.1071 | 41 |
| 1.3658 | 0.1058 | 1.2832 | 0.1083 | 42 |
| 1.3469 | 0.1063 | 1.2637 | 0.1081 | 43 |
| 1.3288 | 0.1067 | 1.2598 | 0.1082 | 44 |
| 1.3111 | 0.1071 | 1.2334 | 0.1096 | 45 |
| 1.2962 | 0.1075 | 1.2490 | 0.1084 | 46 |
| 1.2816 | 0.1078 | 1.2168 | 0.1093 | 47 |
| 1.2672 | 0.1081 | 1.2070 | 0.1090 | 48 |
| 1.2537 | 0.1084 | 1.1680 | 0.1106 | 49 |
| 1.2411 | 0.1087 | 1.1904 | 0.1094 | 50 |
| 1.2285 | 0.1090 | 1.1709 | 0.1103 | 51 |
| 1.2180 | 0.1093 | 1.1602 | 0.1122 | 52 |
| 1.2075 | 0.1095 | 1.1396 | 0.1117 | 53 |
| 1.1973 | 0.1098 | 1.1191 | 0.1124 | 54 |
| 1.1876 | 0.1100 | 1.1260 | 0.1123 | 55 |
| 1.1782 | 0.1102 | 1.1289 | 0.1111 | 56 |
| 1.1698 | 0.1104 | 1.1211 | 0.1117 | 57 |
| 1.1596 | 0.1106 | 1.0977 | 0.1125 | 58 |
| 1.1530 | 0.1108 | 1.1172 | 0.1118 | 59 |
| 1.1462 | 0.1110 | 1.0703 | 0.1126 | 60 |
| 1.1370 | 0.1112 | 1.0830 | 0.1140 | 61 |
| 1.1309 | 0.1113 | 1.0762 | 0.1119 | 62 |
| 1.1234 | 0.1115 | 1.0625 | 0.1137 | 63 |
| 1.1162 | 0.1117 | 1.0781 | 0.1127 | 64 |
| 1.1114 | 0.1118 | 1.0474 | 0.1138 | 65 |
| 1.1036 | 0.1120 | 1.0703 | 0.1134 | 66 |
| 1.0984 | 0.1121 | 1.0366 | 0.1139 | 67 |
| 1.0931 | 0.1122 | 1.0513 | 0.1134 | 68 |
| 1.0860 | 0.1124 | 1.0264 | 0.1137 | 69 |
| 1.0807 | 0.1126 | 1.0215 | 0.1148 | 70 |
| 1.0758 | 0.1127 | 1.0269 | 0.1143 | 71 |
| 1.0704 | 0.1129 | 1.0356 | 0.1141 | 72 |
| 1.0656 | 0.1129 | 1.0195 | 0.1144 | 73 |
| 1.0607 | 0.1131 | 1.0093 | 0.1146 | 74 |
| 1.0559 | 0.1132 | 0.9956 | 0.1155 | 75 |
| 1.0517 | 0.1133 | 0.9995 | 0.1139 | 76 |
| 1.0462 | 0.1134 | 0.9839 | 0.1151 | 77 |
| 1.0422 | 0.1135 | 0.9868 | 0.1153 | 78 |
| 1.0372 | 0.1137 | 0.9995 | 0.1151 | 79 |
| 1.0340 | 0.1137 | 1.0059 | 0.1153 | 80 |
| 1.0296 | 0.1138 | 0.9961 | 0.1152 | 81 |
| 1.0272 | 0.1138 | 1.0132 | 0.1138 | 82 |
| 1.0211 | 0.1140 | 0.9575 | 0.1150 | 83 |
| 1.0182 | 0.1141 | 0.9868 | 0.1150 | 84 |
| 1.0146 | 0.1142 | 0.9678 | 0.1164 | 85 |
| 1.0111 | 0.1143 | 0.9839 | 0.1161 | 86 |
| 1.0083 | 0.1144 | 0.9722 | 0.1162 | 87 |
| 1.0039 | 0.1144 | 0.9619 | 0.1167 | 88 |
| 1.0017 | 0.1145 | 0.9575 | 0.1151 | 89 |
| 0.9973 | 0.1146 | 0.9624 | 0.1149 | 90 |
| 0.9947 | 0.1147 | 0.9570 | 0.1157 | 91 |
| 0.9921 | 0.1148 | 0.9360 | 0.1166 | 92 |
| 0.9884 | 0.1149 | 0.9546 | 0.1156 | 93 |
| 0.9851 | 0.1149 | 0.9536 | 0.1149 | 94 |
| 0.9829 | 0.1150 | 0.9575 | 0.1163 | 95 |
| 0.9795 | 0.1151 | 0.9561 | 0.1156 | 96 |
| 0.9773 | 0.1151 | 0.9438 | 0.1163 | 97 |
| 0.9740 | 0.1152 | 0.9512 | 0.1169 | 98 |
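
The loss values in the results above can be easier to compare across runs when expressed as perplexity. The sketch below assumes the reported loss is a mean cross-entropy in natural-log units, which is the usual Keras convention but is not stated in this card. The helper name `perplexity` is illustrative.

```python
import math


def perplexity(cross_entropy_loss):
    """Convert a mean cross-entropy loss (in nats) to perplexity.

    Assumption: the loss reported in the card is a per-token average
    computed with the natural logarithm.
    """
    return math.exp(cross_entropy_loss)


if __name__ == "__main__":
    # Validation losses taken from the results at epochs 0, 25 and 98.
    for epoch, loss in ((0, 7.1523), (25, 4.4688), (98, 0.9512)):
        print(f"epoch {epoch:>2}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Because the conversion is monotonic, ranking epochs by perplexity gives the same order as ranking them by loss. It only changes the scale on which differences are read.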

Framework versions

  • Transformers 4.27.0.dev0
  • TensorFlow 2.9.1
  • Tokenizers 0.13.2