
mystv0_abbs

This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8311
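
A minimal usage sketch, assuming the checkpoint is available locally or under a Hugging Face repo id (the exact repo id is not stated in this card, so `mystv0_abbs` below is a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Placeholder: substitute the actual repo id or local checkpoint directory.
checkpoint = "mystv0_abbs"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# GPT-2 has no pad token by default; reuse EOS for open-ended generation.
tokenizer.pad_token = tokenizer.eos_token

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("Once upon a time", max_new_tokens=50, do_sample=True)[0]["generated_text"])
```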

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 200
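
As a reconstruction sketch, the hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. `output_dir` is a placeholder, and the listed total train batch size of 256 is consistent with 32 per device × 8 gradient-accumulation steps per process:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mystv0_abbs",          # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=8,     # 32 per device x 8 steps = 256 effective batch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=200,
)
```

Under this configuration the cosine schedule decays the learning rate from 5e-4 after the 1,000-step warmup.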

Training results

Training Loss | Epoch  | Step  | Validation Loss
------------- | ------ | ----- | ---------------
1.3179        | 5.12   | 1000  | 1.4846
1.0028        | 10.24  | 2000  | 1.3123
0.997         | 15.36  | 3000  | 1.4747
0.9951        | 20.47  | 4000  | 1.5312
0.9936        | 25.59  | 5000  | 1.5280
0.9922        | 30.71  | 6000  | 1.6307
0.9906        | 35.83  | 7000  | 1.5866
0.9907        | 40.95  | 8000  | 1.5732
0.989         | 46.07  | 9000  | 1.4607
0.9883        | 51.18  | 10000 | 1.4982
0.9875        | 56.3   | 11000 | 1.4993
0.9865        | 61.42  | 12000 | 1.6250
0.9859        | 66.54  | 13000 | 1.5613
0.9847        | 71.66  | 14000 | 1.7067
0.9841        | 76.78  | 15000 | 1.6076
0.9833        | 81.89  | 16000 | 1.6163
0.9824        | 87.01  | 17000 | 1.6662
0.9813        | 92.13  | 18000 | 1.4881
0.9806        | 97.25  | 19000 | 1.5930
0.9799        | 102.37 | 20000 | 1.6575
0.9789        | 107.49 | 21000 | 1.7150
0.9778        | 112.6  | 22000 | 1.6655
0.9771        | 117.72 | 23000 | 1.7227
0.976         | 122.84 | 24000 | 1.6713
0.9749        | 127.96 | 25000 | 1.7268
0.9738        | 133.08 | 26000 | 1.6831
0.9725        | 138.2  | 27000 | 1.7084
0.9714        | 143.31 | 28000 | 1.6809
0.9697        | 148.43 | 29000 | 1.7715
0.9683        | 153.55 | 30000 | 1.7643
0.9668        | 158.67 | 31000 | 1.8609
0.9654        | 163.79 | 32000 | 1.8082
0.964         | 168.91 | 33000 | 1.8623
0.9627        | 174.02 | 34000 | 1.8146
0.9617        | 179.14 | 35000 | 1.8173
0.9609        | 184.26 | 36000 | 1.8237
0.9604        | 189.38 | 37000 | 1.8314
0.9601        | 194.5  | 38000 | 1.8316
0.96          | 199.62 | 39000 | 1.8311

Validation loss bottoms out at 1.3123 around epoch 10 and trends upward thereafter while training loss keeps falling, which suggests the model overfits the training data well before 200 epochs.

Framework versions

  • Transformers 4.30.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.13.1
  • Tokenizers 0.13.3
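
For reproducibility, a small sanity-check sketch that the runtime environment matches the versions above (import names assume the standard PyPI packages):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions this model was trained with; warn if the runtime differs.
expected = {
    transformers: "4.30.2",
    torch: "2.1.0+cu121",
    datasets: "2.13.1",
    tokenizers: "0.13.3",
}
for module, version in expected.items():
    if module.__version__ != version:
        print(f"Warning: {module.__name__} {module.__version__} != trained-with {version}")
```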