metadata
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: textgenerator
    results: []

textgenerator

This model is a fine-tuned version of gpt2. The training dataset is not documented. At the final epoch it reports the following results:

  • Train Loss: 5.7309
  • Validation Loss: 6.3030
  • Epoch: 19
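To put the losses above on a more familiar scale, they can be converted to perplexity. This is a minimal sketch that assumes the reported values are mean per-token cross-entropy in nats, which is the usual case for a causal language model but is not stated in this card. The function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Final-epoch values copied from the list above.
train_ppl = perplexity(5.7309)
val_ppl = perplexity(6.3030)

print(f"train perplexity: {train_ppl:.1f}")
print(f"validation perplexity: {val_ppl:.1f}")
```

If the loss was computed in a different unit or with a different reduction, these numbers would not apply.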

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-05, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': -887, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: mixed_float16
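The optimizer entry above describes a learning rate that ramps up during warmup and then follows a polynomial decay. The following is a rough pure-Python sketch of that shape, using the values from the config (`initial_learning_rate` 5e-05, `warmup_steps` 1000, `power` 1.0, `end_learning_rate` 0.0). The function name `learning_rate` and its `decay_steps` argument handling are illustrative and are not the library's implementation. Note that the logged `decay_steps` is -887. If that value was derived as total training steps minus warmup steps (an assumption, not stated in this card), the run would have had about 113 steps and never left the warmup phase.

```python
def learning_rate(step, initial_lr=5e-05, warmup_steps=1000,
                  decay_steps=None, end_lr=0.0, power=1.0):
    """Sketch of a warmup-then-polynomial-decay schedule (illustrative only)."""
    if step < warmup_steps:
        # Warmup: scale the base rate by the fraction of warmup completed.
        return initial_lr * (step / warmup_steps) ** power
    if decay_steps is None or decay_steps <= 0:
        # The logged decay_steps (-887) is not a usable decay length.
        raise ValueError("decay_steps must be positive after warmup")
    # Polynomial decay without cycling: clamp progress at the end of the decay.
    progress = min(step - warmup_steps, decay_steps) / decay_steps
    return (initial_lr - end_lr) * (1 - progress) ** power + end_lr


# Halfway through warmup the rate is half of the base rate.
halfway = learning_rate(500)

# Hypothetical last step under the 113-step assumption described above.
assumed_last_step = learning_rate(112)

# With a hypothetical positive decay length, the rate reaches end_lr.
decayed = learning_rate(2000, decay_steps=1000)
```

Under the 113-step assumption, the effective learning rate would have stayed far below 5e-05 for the whole run.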

Training results

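| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 10.2937    | 9.6742          | 0     |
| 9.2524     | 8.8693          | 1     |
| 8.2666     | 7.9135          | 2     |
| 7.3757     | 7.3273          | 3     |
| 6.9147     | 6.9741          | 4     |
| 6.5844     | 6.7259          | 5     |
| 6.3340     | 6.5383          | 6     |
| 6.0966     | 6.3904          | 7     |
| 5.8915     | 6.3030          | 8     |
| 5.7314     | 6.3030          | 9     |
| 5.7268     | 6.3030          | 10    |
| 5.7300     | 6.3030          | 11    |
| 5.7283     | 6.3030          | 12    |
| 5.7314     | 6.3030          | 13    |
| 5.7284     | 6.3030          | 14    |
| 5.7323     | 6.3030          | 15    |
| 5.7304     | 6.3030          | 16    |
| 5.7292     | 6.3030          | 17    |
| 5.7311     | 6.3030          | 18    |
| 5.7309     | 6.3030          | 19    |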
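The training results above can be checked programmatically to see where validation loss stopped improving. This is a small sketch with the rows copied from the table. The helper name `first_best_epoch` is illustrative.

```python
# (train_loss, validation_loss, epoch) rows copied from the training results.
results = [
    (10.2937, 9.6742, 0),
    (9.2524, 8.8693, 1),
    (8.2666, 7.9135, 2),
    (7.3757, 7.3273, 3),
    (6.9147, 6.9741, 4),
    (6.5844, 6.7259, 5),
    (6.3340, 6.5383, 6),
    (6.0966, 6.3904, 7),
    (5.8915, 6.3030, 8),
    (5.7314, 6.3030, 9),
    (5.7268, 6.3030, 10),
    (5.7300, 6.3030, 11),
    (5.7283, 6.3030, 12),
    (5.7314, 6.3030, 13),
    (5.7284, 6.3030, 14),
    (5.7323, 6.3030, 15),
    (5.7304, 6.3030, 16),
    (5.7292, 6.3030, 17),
    (5.7311, 6.3030, 18),
    (5.7309, 6.3030, 19),
]


def first_best_epoch(rows):
    """Return the earliest epoch that reaches the lowest validation loss."""
    best_loss = min(val for _, val, _ in rows)
    return next(epoch for _, val, epoch in rows if val == best_loss)


best_epoch = first_best_epoch(results)
# Epochs after the best one brought no further validation improvement.
epochs_without_improvement = [e for _, _, e in results if e > best_epoch]

print(f"best validation loss first reached at epoch {best_epoch}")
print(f"epochs without improvement afterwards: {len(epochs_without_improvement)}")
```

Validation loss reaches 6.3030 at epoch 8 and is reported as exactly the same value for every later epoch, while train loss hovers around 5.73. An early-stopping callback would likely have ended the run much sooner.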

Framework versions

  • Transformers 4.21.0
  • TensorFlow 2.8.2
  • Datasets 2.4.0
  • Tokenizers 0.12.1