---
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: Zemulax/masked-lm-tpu
    results: []
---

# Zemulax/masked-lm-tpu

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset. It achieves the following results at the end of training (epoch 28):

- Train Loss: 8.9625
- Train Accuracy: 0.0229
- Validation Loss: 8.8969
- Validation Accuracy: 0.0221
- Epoch: 28
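
Since this is a standard Transformers masked-LM checkpoint trained with TensorFlow, it can presumably be loaded for fill-mask inference like any other TF model. The sketch below assumes the hosted repo id `Zemulax/masked-lm-tpu`; note that, given the reported validation accuracy of roughly 2%, predictions will be close to random.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("Zemulax/masked-lm-tpu")
model = TFAutoModelForMaskedLM.from_pretrained("Zemulax/masked-lm-tpu")

# RoBERTa-style tokenizers use "<mask>" as the mask token
text = f"The capital of France is {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="tf")
logits = model(**inputs).logits

# Locate the mask position and take the highest-scoring vocabulary id
mask_index = int(tf.where(inputs["input_ids"][0] == tokenizer.mask_token_id)[0, 0])
predicted_id = int(tf.argmax(logits[0, mask_index]))
print(tokenizer.decode([predicted_id]))
```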

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how an equivalent optimizer can be rebuilt):

- optimizer: AdamWeightDecay
  - learning_rate: WarmUp over 11,750 steps up to an initial rate of 1e-4, then PolynomialDecay (power 1.0, no cycling) down to an end rate of 0.0 over 223,250 decay steps
  - beta_1: 0.9, beta_2: 0.999, epsilon: 1e-08, amsgrad: False
  - weight_decay_rate: 0.001
- training_precision: float32
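
The exact call used for this run is not recorded in the card, but the serialized optimizer config matches what `transformers.create_optimizer` emits: an AdamWeightDecay optimizer whose learning rate is a WarmUp schedule wrapping a PolynomialDecay. A minimal sketch of rebuilding an equivalent optimizer:

```python
from transformers import create_optimizer

# 235,000 total steps = 11,750 warmup steps + 223,250 decay steps,
# so the decay schedule reaches 0.0 exactly as in the config above.
optimizer, lr_schedule = create_optimizer(
    init_lr=1e-4,
    num_train_steps=235_000,
    num_warmup_steps=11_750,
    weight_decay_rate=0.001,  # matches weight_decay_rate in the config
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```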

### Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 10.2868    | 0.0            | 10.2891         | 0.0                 | 0     |
| 10.2817    | 0.0000         | 10.2764         | 0.0                 | 1     |
| 10.2772    | 0.0000         | 10.2667         | 0.0000              | 2     |
| 10.2604    | 0.0000         | 10.2521         | 0.0                 | 3     |
| 10.2421    | 0.0000         | 10.2282         | 0.0000              | 4     |
| 10.2219    | 0.0            | 10.2010         | 0.0                 | 5     |
| 10.1957    | 0.0            | 10.1669         | 0.0                 | 6     |
| 10.1667    | 0.0000         | 10.1388         | 0.0000              | 7     |
| 10.1278    | 0.0000         | 10.0908         | 0.0000              | 8     |
| 10.0848    | 0.0000         | 10.0405         | 0.0001              | 9     |
| 10.0496    | 0.0002         | 9.9921          | 0.0007              | 10    |
| 9.9940     | 0.0010         | 9.9422          | 0.0039              | 11    |
| 9.9424     | 0.0035         | 9.8765          | 0.0110              | 12    |
| 9.8826     | 0.0092         | 9.8156          | 0.0182              | 13    |
| 9.8225     | 0.0155         | 9.7461          | 0.0209              | 14    |
| 9.7670     | 0.0201         | 9.6768          | 0.0222              | 15    |
| 9.7065     | 0.0219         | 9.6127          | 0.0222              | 16    |
| 9.6352     | 0.0227         | 9.5445          | 0.0220              | 17    |
| 9.5757     | 0.0226         | 9.4795          | 0.0219              | 18    |
| 9.4894     | 0.0232         | 9.3985          | 0.0222              | 19    |
| 9.4277     | 0.0234         | 9.3386          | 0.0222              | 20    |
| 9.3676     | 0.0229         | 9.2753          | 0.0220              | 21    |
| 9.2980     | 0.0229         | 9.2170          | 0.0219              | 22    |
| 9.2361     | 0.0233         | 9.1518          | 0.0219              | 23    |
| 9.1515     | 0.0236         | 9.0827          | 0.0223              | 24    |
| 9.1171     | 0.0228         | 9.0406          | 0.0218              | 25    |
| 9.0447     | 0.0234         | 8.9867          | 0.0218              | 26    |
| 9.0119     | 0.0229         | 8.9307          | 0.0221              | 27    |
| 8.9625     | 0.0229         | 8.8969          | 0.0221              | 28    |

### Framework versions

- Transformers 4.30.1
- TensorFlow 2.12.0
- Tokenizers 0.13.3