
gpt2-large-coedit

This model is a fine-tuned version of openai-community/gpt2-large on an unknown dataset. It achieves the following results on the evaluation set (the values match the final logged evaluation, at step 200, epoch 1.88):

  • Loss: 0.9215
  • Rouge1: 0.4818
  • Rouge2: 0.3649
  • Rougel: 0.4555
  • Rougelsum: 0.4643
  • Sacreblue: 19.1714
  • Memory Used: 68475.5
  • Cuda Allocated: 3082.6328
  • Cuda Reserved: 61060.0
  • Ram Usage: 13976.5117
  • Em: 0.0
  • Gen Len: 82.1798

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 150
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 600
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1
  • num_epochs: 2
  • mixed_precision_training: Native AMP
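
As a rough illustration, the hyperparameters listed above could be expressed as keyword arguments for `transformers.TrainingArguments`. This is a sketch and not the original training script: the `output_dir` value and the helper names are hypothetical, "Native AMP" is assumed to mean `fp16=True` (it could equally have been `bf16`), and the batch sizes are assumed to be per-device values on a single device, which is consistent with 150 × 4 = 600.

```python
from typing import Any, Dict


def training_kwargs(output_dir: str = "gpt2-large-coedit") -> Dict[str, Any]:
    """Keyword arguments mirroring the hyperparameters listed in this card.

    `output_dir` is a hypothetical placeholder, not taken from the card.
    """
    return {
        "output_dir": output_dir,
        "learning_rate": 2e-05,
        "per_device_train_batch_size": 150,  # assumes a single device
        "per_device_eval_batch_size": 4,
        "seed": 42,
        "gradient_accumulation_steps": 4,
        "adam_beta1": 0.9,
        "adam_beta2": 0.999,
        "adam_epsilon": 1e-08,
        "lr_scheduler_type": "linear",
        "warmup_steps": 1,
        "num_train_epochs": 2,
        "fp16": True,  # assumption: "Native AMP" taken to mean fp16
    }


def effective_batch_size(kwargs: Dict[str, Any], num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_arguments(output_dir: str = "gpt2-large-coedit"):
    """Construct TrainingArguments; imported lazily so the helpers above
    work without transformers installed. fp16=True needs a supported GPU."""
    from transformers import TrainingArguments

    return TrainingArguments(**training_kwargs(output_dir))
```

Under the single-device assumption, `effective_batch_size(training_kwargs())` reproduces the reported total_train_batch_size of 600.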

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Sacreblue | Memory Used | Cuda Allocated | Cuda Reserved | Ram Usage | Em | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.8724 | 0.47 | 50 | 1.0274 | 0.4653 | 0.3509 | 0.4382 | 0.4459 | 19.0412 | 68475.5 | 3082.605 | 61060.0 | 5708.957 | 0.0 | 82.0895 |
| 0.7407 | 0.94 | 100 | 0.9499 | 0.4825 | 0.3651 | 0.4557 | 0.4656 | 19.2975 | 68475.5 | 3082.6152 | 61060.0 | 13842.9336 | 0.0 | 81.3952 |
| 0.6964 | 1.41 | 150 | 0.9318 | 0.4783 | 0.3627 | 0.452 | 0.4605 | 19.418 | 68475.5 | 3082.6182 | 61060.0 | 13958.2773 | 0.0 | 81.0295 |
| 0.6846 | 1.88 | 200 | 0.9215 | 0.4818 | 0.3649 | 0.4555 | 0.4643 | 19.1714 | 68475.5 | 3082.6328 | 61060.0 | 13976.5117 | 0.0 | 82.1798 |
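
The dataset size is not documented, but the logged (epoch, step) pairs allow a rough estimate of the number of optimizer steps per epoch and, combined with the total train batch size of 600, the approximate number of training examples. The sketch below is only an estimate: the epoch values are rounded to two decimals, and the function names are illustrative rather than part of any library.

```python
from typing import List, Tuple

# (epoch, step) pairs copied from the training results table.
LOGGED: List[Tuple[float, int]] = [(0.47, 50), (0.94, 100), (1.41, 150), (1.88, 200)]

# total_train_batch_size from the hyperparameter list.
TOTAL_TRAIN_BATCH_SIZE = 600


def steps_per_epoch(points: List[Tuple[float, int]]) -> float:
    """Average of step / epoch over all logged evaluation points."""
    return sum(step / epoch for epoch, step in points) / len(points)


def approx_examples_per_epoch(points: List[Tuple[float, int]], batch_size: int) -> float:
    """Rough training-set size: optimizer steps per epoch times examples per step."""
    return steps_per_epoch(points) * batch_size


if __name__ == "__main__":
    print(f"steps per epoch ~ {steps_per_epoch(LOGGED):.1f}")
    print(f"examples per epoch ~ {approx_examples_per_epoch(LOGGED, TOTAL_TRAIN_BATCH_SIZE):.0f}")
```

With these inputs the estimate comes out at roughly 106 steps per epoch, or on the order of 64,000 training examples, subject to the rounding of the logged epoch values.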

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Safetensors: model size 774M params, tensor type F32