
50000usd

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.5760
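
For reference, a token-level cross-entropy loss of 6.5760 corresponds to a perplexity of exp(6.5760) ≈ 717.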

Model description

More information needed

Intended uses & limitations

More information needed
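
In the absence of documented usage guidance, below is a minimal sketch of loading the checkpoint for text generation with the Transformers AutoClasses. The repository id is a placeholder assumption, not the model's actual Hub id.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id: substitute the actual Hub id of this model.
repo_id = "your-username/50000usd"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Sample a short continuation from the fine-tuned model.
inputs = tokenizer("Hello,", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```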

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 50
  • mixed_precision_training: Native AMP
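
Assuming the model was trained with the Transformers Trainer (which this card's autogenerated format suggests), the settings above map onto TrainingArguments roughly as sketched below; output_dir is a placeholder, and fp16=True stands in for "Native AMP".

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-finetuned",    # placeholder path
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # 8 * 8 = 64 effective train batch size
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=50,
    fp16=True,                      # Native AMP mixed precision
)
```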

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 0.8   | 1    | 6.9760          |
| No log        | 1.6   | 2    | 6.6552          |
| No log        | 2.4   | 3    | 6.5902          |
| No log        | 4.0   | 5    | 6.5775          |
| No log        | 4.8   | 6    | 6.5818          |
| No log        | 5.6   | 7    | 6.5848          |
| No log        | 6.4   | 8    | 6.5881          |
| 6.3782        | 8.0   | 10   | 6.5849          |
| 6.3782        | 8.8   | 11   | 6.5811          |
| 6.3782        | 9.6   | 12   | 6.5751          |
| 6.3782        | 10.4  | 13   | 6.5729          |
| 6.3782        | 12.0  | 15   | 6.5785          |
| 6.3782        | 12.8  | 16   | 6.5804          |
| 6.3782        | 13.6  | 17   | 6.5819          |
| 6.3782        | 14.4  | 18   | 6.5836          |
| 6.2143        | 16.0  | 20   | 6.5851          |
| 6.2143        | 16.8  | 21   | 6.5833          |
| 6.2143        | 17.6  | 22   | 6.5813          |
| 6.2143        | 18.4  | 23   | 6.5784          |
| 6.2143        | 20.0  | 25   | 6.5766          |
| 6.2143        | 20.8  | 26   | 6.5764          |
| 6.2143        | 21.6  | 27   | 6.5765          |
| 6.2143        | 22.4  | 28   | 6.5762          |
| 6.2074        | 24.0  | 30   | 6.5763          |
| 6.2074        | 24.8  | 31   | 6.5761          |
| 6.2074        | 25.6  | 32   | 6.5763          |
| 6.2074        | 26.4  | 33   | 6.5757          |
| 6.2074        | 28.0  | 35   | 6.5758          |
| 6.2074        | 28.8  | 36   | 6.5754          |
| 6.2074        | 29.6  | 37   | 6.5760          |
| 6.2074        | 30.4  | 38   | 6.5763          |
| 6.2047        | 32.0  | 40   | 6.5760          |
| 6.2047        | 32.8  | 41   | 6.5764          |
| 6.2047        | 33.6  | 42   | 6.5755          |
| 6.2047        | 34.4  | 43   | 6.5758          |
| 6.2047        | 36.0  | 45   | 6.5759          |
| 6.2047        | 36.8  | 46   | 6.5760          |
| 6.2047        | 37.6  | 47   | 6.5760          |
| 6.2047        | 38.4  | 48   | 6.5761          |
| 6.2026        | 40.0  | 50   | 6.5760          |

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2