---
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: Zemulax/masked-lm-tpu
    results: []
---

# Zemulax/masked-lm-tpu

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset. At the final training epoch (98) it reaches the following results; a minimal usage sketch follows the list:

- Train Loss: 7.7770
- Train Accuracy: 0.0241
- Validation Loss: 7.7589
- Validation Accuracy: 0.0230
- Epoch: 98
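
The card does not include a usage snippet; the sketch below is an assumption about how this checkpoint would be loaded for masked-token prediction with the TensorFlow classes and the standard `fill-mask` pipeline from `transformers`. Given the ~2% masked-token accuracy reported above, the checkpoint is unlikely to produce meaningful completions.

```python
from transformers import AutoTokenizer, TFAutoModelForMaskedLM, pipeline

model_id = "Zemulax/masked-lm-tpu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForMaskedLM.from_pretrained(model_id)

# fill-mask handles tokenization and returns the top candidates for the
# RoBERTa-style <mask> token; expect near-random predictions from this
# barely-converged checkpoint.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer, framework="tf")
print(fill_mask("The capital of France is <mask>."))
```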

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a sketch of rebuilding this optimizer configuration follows the list:

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 223250, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 11750, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
- training_precision: float32
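
The serialized config above (AdamWeightDecay with a WarmUp-wrapped PolynomialDecay schedule) can be reproduced with the TensorFlow optimizer helper in `transformers`. The sketch below is an assumption, not the author's training script; it reads the step counts off the config: 11,750 warmup steps and 223,250 linear-decay steps, i.e. 235,000 total steps passed to the helper, which subtracts the warmup steps when building the decay schedule.

```python
from transformers import create_optimizer

# Rebuild AdamWeightDecay + WarmUp(PolynomialDecay) from the serialized config.
optimizer, lr_schedule = create_optimizer(
    init_lr=1e-4,             # initial_learning_rate
    num_train_steps=235_000,  # 223_250 decay steps + 11_750 warmup steps
    num_warmup_steps=11_750,
    weight_decay_rate=0.001,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    power=1.0,                # linear decay down to end_learning_rate = 0.0
)
# model.compile(optimizer=optimizer, ...) would then train in float32,
# matching the reported training_precision.
```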

### Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 10.2868 | 0.0 | 10.2891 | 0.0 | 0 |
| 10.2817 | 0.0000 | 10.2764 | 0.0 | 1 |
| 10.2772 | 0.0000 | 10.2667 | 0.0000 | 2 |
| 10.2604 | 0.0000 | 10.2521 | 0.0 | 3 |
| 10.2421 | 0.0000 | 10.2282 | 0.0000 | 4 |
| 10.2219 | 0.0 | 10.2010 | 0.0 | 5 |
| 10.1957 | 0.0 | 10.1669 | 0.0 | 6 |
| 10.1667 | 0.0000 | 10.1388 | 0.0000 | 7 |
| 10.1278 | 0.0000 | 10.0908 | 0.0000 | 8 |
| 10.0848 | 0.0000 | 10.0405 | 0.0001 | 9 |
| 10.0496 | 0.0002 | 9.9921 | 0.0007 | 10 |
| 9.9940 | 0.0010 | 9.9422 | 0.0039 | 11 |
| 9.9424 | 0.0035 | 9.8765 | 0.0110 | 12 |
| 9.8826 | 0.0092 | 9.8156 | 0.0182 | 13 |
| 9.8225 | 0.0155 | 9.7461 | 0.0209 | 14 |
| 9.7670 | 0.0201 | 9.6768 | 0.0222 | 15 |
| 9.7065 | 0.0219 | 9.6127 | 0.0222 | 16 |
| 9.6352 | 0.0227 | 9.5445 | 0.0220 | 17 |
| 9.5757 | 0.0226 | 9.4795 | 0.0219 | 18 |
| 9.4894 | 0.0232 | 9.3985 | 0.0222 | 19 |
| 9.4277 | 0.0234 | 9.3386 | 0.0222 | 20 |
| 9.3676 | 0.0229 | 9.2753 | 0.0220 | 21 |
| 9.2980 | 0.0229 | 9.2170 | 0.0219 | 22 |
| 9.2361 | 0.0233 | 9.1518 | 0.0219 | 23 |
| 9.1515 | 0.0236 | 9.0827 | 0.0223 | 24 |
| 9.1171 | 0.0228 | 9.0406 | 0.0218 | 25 |
| 9.0447 | 0.0234 | 8.9867 | 0.0218 | 26 |
| 9.0119 | 0.0229 | 8.9307 | 0.0221 | 27 |
| 8.9625 | 0.0229 | 8.8969 | 0.0221 | 28 |
| 8.9098 | 0.0230 | 8.8341 | 0.0223 | 29 |
| 8.8726 | 0.0227 | 8.8118 | 0.0220 | 30 |
| 8.8574 | 0.0223 | 8.7910 | 0.0219 | 31 |
| 8.7798 | 0.0231 | 8.7506 | 0.0221 | 32 |
| 8.7535 | 0.0231 | 8.7055 | 0.0222 | 33 |
| 8.7333 | 0.0228 | 8.6801 | 0.0223 | 34 |
| 8.6985 | 0.0231 | 8.6837 | 0.0220 | 35 |
| 8.6816 | 0.0229 | 8.6243 | 0.0223 | 36 |
| 8.6356 | 0.0228 | 8.6323 | 0.0217 | 37 |
| 8.6392 | 0.0225 | 8.5603 | 0.0225 | 38 |
| 8.5802 | 0.0233 | 8.5722 | 0.0219 | 39 |
| 8.5825 | 0.0228 | 8.5548 | 0.0220 | 40 |
| 8.5625 | 0.0228 | 8.5272 | 0.0220 | 41 |
| 8.5415 | 0.0228 | 8.5200 | 0.0222 | 42 |
| 8.5124 | 0.0230 | 8.4787 | 0.0222 | 43 |
| 8.4999 | 0.0229 | 8.4819 | 0.0218 | 44 |
| 8.4561 | 0.0235 | 8.4453 | 0.0221 | 45 |
| 8.4854 | 0.0223 | 8.4378 | 0.0220 | 46 |
| 8.4367 | 0.0229 | 8.4212 | 0.0222 | 47 |
| 8.4096 | 0.0232 | 8.4033 | 0.0221 | 48 |
| 8.4162 | 0.0228 | 8.3869 | 0.0221 | 49 |
| 8.4005 | 0.0229 | 8.3768 | 0.0218 | 50 |
| 8.3583 | 0.0235 | 8.3470 | 0.0224 | 51 |
| 8.3428 | 0.0235 | 8.3540 | 0.0221 | 52 |
| 8.3491 | 0.0231 | 8.3201 | 0.0225 | 53 |
| 8.3551 | 0.0231 | 8.3382 | 0.0221 | 54 |
| 8.3186 | 0.0231 | 8.3136 | 0.0219 | 55 |
| 8.3139 | 0.0226 | 8.2844 | 0.0222 | 56 |
| 8.3170 | 0.0229 | 8.2740 | 0.0221 | 57 |
| 8.2886 | 0.0231 | 8.2485 | 0.0223 | 58 |
| 8.2648 | 0.0233 | 8.2336 | 0.0223 | 59 |
| 8.2714 | 0.0225 | 8.2321 | 0.0221 | 60 |
| 8.2446 | 0.0233 | 8.2135 | 0.0223 | 61 |
| 8.2303 | 0.0230 | 8.1980 | 0.0223 | 62 |
| 8.2022 | 0.0237 | 8.1996 | 0.0222 | 63 |
| 8.2222 | 0.0227 | 8.1822 | 0.0222 | 64 |
| 8.1690 | 0.0236 | 8.2005 | 0.0220 | 65 |
| 8.1741 | 0.0233 | 8.1446 | 0.0226 | 66 |
| 8.1990 | 0.0224 | 8.1586 | 0.0219 | 67 |
| 8.1395 | 0.0236 | 8.1243 | 0.0225 | 68 |
| 8.1675 | 0.0229 | 8.1275 | 0.0222 | 69 |
| 8.1432 | 0.0229 | 8.1374 | 0.0217 | 70 |
| 8.1197 | 0.0234 | 8.1078 | 0.0221 | 71 |
| 8.1046 | 0.0232 | 8.0991 | 0.0221 | 72 |
| 8.1013 | 0.0231 | 8.0794 | 0.0222 | 73 |
| 8.0887 | 0.0228 | 8.0720 | 0.0221 | 74 |
| 8.0661 | 0.0233 | 8.0573 | 0.0222 | 75 |
| 8.0548 | 0.0231 | 8.0313 | 0.0226 | 76 |
| 8.0307 | 0.0235 | 8.0278 | 0.0222 | 77 |
| 8.0626 | 0.0226 | 8.0084 | 0.0224 | 78 |
| 8.0276 | 0.0229 | 8.0099 | 0.0221 | 79 |
| 8.0213 | 0.0231 | 7.9930 | 0.0222 | 80 |
| 7.9798 | 0.0237 | 7.9742 | 0.0224 | 81 |
| 8.0135 | 0.0226 | 7.9857 | 0.0218 | 82 |
| 7.9500 | 0.0235 | 7.9505 | 0.0223 | 83 |
| 7.9519 | 0.0234 | 7.9711 | 0.0217 | 84 |
| 7.9616 | 0.0228 | 7.9288 | 0.0223 | 85 |
| 7.9803 | 0.0225 | 7.8997 | 0.0226 | 86 |
| 7.9369 | 0.0227 | 7.9015 | 0.0225 | 87 |
| 7.9309 | 0.0229 | 7.9010 | 0.0224 | 88 |
| 7.9367 | 0.0226 | 7.8988 | 0.0220 | 89 |
| 7.8840 | 0.0230 | 7.8774 | 0.0216 | 90 |
| 7.8785 | 0.0233 | 7.8527 | 0.0225 | 91 |
| 7.8998 | 0.0226 | 7.8509 | 0.0219 | 92 |
| 7.8451 | 0.0232 | 7.8488 | 0.0221 | 93 |
| 7.8596 | 0.0231 | 7.8310 | 0.0222 | 94 |
| 7.8434 | 0.0231 | 7.8168 | 0.0229 | 95 |
| 7.7929 | 0.0238 | 7.7815 | 0.0233 | 96 |
| 7.8174 | 0.0236 | 7.7857 | 0.0232 | 97 |
| 7.7770 | 0.0241 | 7.7589 | 0.0230 | 98 |

### Framework versions

- Transformers 4.30.1
- TensorFlow 2.12.0
- Tokenizers 0.13.3
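
When trying to reproduce or debug this run, it may help to confirm the local environment matches the versions listed above; a small illustrative check:

```python
# Illustrative check that the environment matches the versions this card reports.
import tensorflow as tf
import tokenizers
import transformers

print("transformers", transformers.__version__)  # card reports 4.30.1
print("tensorflow  ", tf.__version__)            # card reports 2.12.0
print("tokenizers  ", tokenizers.__version__)    # card reports 0.13.3
```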