
vi_gpt_poem_

This model is a fine-tuned version of NlpHUST/gpt-neo-vi-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1334

Model description

More information needed

Intended uses & limitations

More information needed
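
Absent further details from the author, the checkpoint can presumably be used like any GPT-Neo causal language model for Vietnamese text generation. Below is a minimal loading sketch; the repository id your-username/vi_gpt_poem_ is a placeholder, since this card does not state the published namespace.

```python
# Hedged sketch: load the fine-tuned checkpoint for Vietnamese text generation.
# "your-username/vi_gpt_poem_" is a PLACEHOLDER repo id, not confirmed by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/vi_gpt_poem_"  # placeholder namespace
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Mùa thu"  # a short Vietnamese seed phrase ("autumn")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```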

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 42
  • eval_batch_size: 42
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 250
  • mixed_precision_training: Native AMP
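
For reproducibility, here is a hedged sketch of how these settings map onto transformers' TrainingArguments. The output directory and the 500-step evaluation cadence are assumptions; the cadence is inferred from the Step column of the results table below.

```python
# Hedged sketch: the hyperparameters above expressed as TrainingArguments
# (transformers 4.40). Adam betas=(0.9, 0.999) and epsilon=1e-08 match the
# library defaults, so they need no explicit arguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vi_gpt_poem_",        # ASSUMED output directory
    learning_rate=1e-5,
    per_device_train_batch_size=42,
    per_device_eval_batch_size=42,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=250,
    fp16=True,                        # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=500,                   # ASSUMED: matches the 500-step rows below
)
```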

Training results

| Training Loss | Epoch    | Step  | Validation Loss |
|:-------------:|:--------:|:-----:|:---------------:|
| 6.8634        | 3.9683   | 500   | 6.1900          |
| 4.7999        | 7.9365   | 1000  | 3.4039          |
| 2.7473        | 11.9048  | 1500  | 2.5766          |
| 2.2513        | 15.8730  | 2000  | 2.2051          |
| 1.9426        | 19.8413  | 2500  | 1.9113          |
| 1.7059        | 23.8095  | 3000  | 1.6723          |
| 1.5333        | 27.7778  | 3500  | 1.5196          |
| 1.3996        | 31.7460  | 4000  | 1.4060          |
| 1.3066        | 35.7143  | 4500  | 1.3193          |
| 1.2280        | 39.6825  | 5000  | 1.2513          |
| 1.1642        | 43.6508  | 5500  | 1.2000          |
| 1.1191        | 47.6190  | 6000  | 1.1607          |
| 1.0825        | 51.5873  | 6500  | 1.1295          |
| 1.0483        | 55.5556  | 7000  | 1.1036          |
| 1.0203        | 59.5238  | 7500  | 1.0818          |
| 0.9967        | 63.4921  | 8000  | 1.0631          |
| 0.9745        | 67.4603  | 8500  | 1.0471          |
| 0.9552        | 71.4286  | 9000  | 1.0332          |
| 0.9362        | 75.3968  | 9500  | 1.0208          |
| 0.9165        | 79.3651  | 10000 | 1.0098          |
| 0.8977        | 83.3333  | 10500 | 1.0002          |
| 0.8846        | 87.3016  | 11000 | 0.9915          |
| 0.8641        | 91.2698  | 11500 | 0.9838          |
| 0.8478        | 95.2381  | 12000 | 0.9779          |
| 0.8286        | 99.2063  | 12500 | 0.9721          |
| 0.8110        | 103.1746 | 13000 | 0.9677          |
| 0.7916        | 107.1429 | 13500 | 0.9644          |
| 0.7721        | 111.1111 | 14000 | 0.9625          |
| 0.7513        | 115.0794 | 14500 | 0.9616          |
| 0.7292        | 119.0476 | 15000 | 0.9617          |
| 0.7066        | 123.0159 | 15500 | 0.9622          |
| 0.6830        | 126.9841 | 16000 | 0.9639          |
| 0.6582        | 130.9524 | 16500 | 0.9661          |
| 0.6320        | 134.9206 | 17000 | 0.9690          |
| 0.6047        | 138.8889 | 17500 | 0.9727          |
| 0.5769        | 142.8571 | 18000 | 0.9763          |
| 0.5480        | 146.8254 | 18500 | 0.9802          |
| 0.5169        | 150.7937 | 19000 | 0.9844          |
| 0.4863        | 154.7619 | 19500 | 0.9887          |
| 0.4536        | 158.7302 | 20000 | 0.9936          |
| 0.4223        | 162.6984 | 20500 | 0.9975          |
| 0.3891        | 166.6667 | 21000 | 1.0022          |
| 0.3571        | 170.6349 | 21500 | 1.0071          |
| 0.3256        | 174.6032 | 22000 | 1.0118          |
| 0.2946        | 178.5714 | 22500 | 1.0164          |
| 0.2642        | 182.5397 | 23000 | 1.0221          |
| 0.2345        | 186.5079 | 23500 | 1.0271          |
| 0.2069        | 190.4762 | 24000 | 1.0331          |
| 0.1806        | 194.4444 | 24500 | 1.0393          |
| 0.1565        | 198.4127 | 25000 | 1.0462          |
| 0.1351        | 202.3810 | 25500 | 1.0527          |
| 0.1153        | 206.3492 | 26000 | 1.0605          |
| 0.0984        | 210.3175 | 26500 | 1.0679          |
| 0.0842        | 214.2857 | 27000 | 1.0758          |
| 0.0721        | 218.2540 | 27500 | 1.0827          |
| 0.0627        | 222.2222 | 28000 | 1.0906          |
| 0.0555        | 226.1905 | 28500 | 1.0978          |
| 0.0495        | 230.1587 | 29000 | 1.1043          |
| 0.0450        | 234.1270 | 29500 | 1.1107          |
| 0.0412        | 238.0952 | 30000 | 1.1166          |
| 0.0382        | 242.0635 | 30500 | 1.1228          |
| 0.0356        | 246.0317 | 31000 | 1.1275          |
| 0.0335        | 250.0000 | 31500 | 1.1334          |
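
Validation loss bottoms out at 0.9616 around step 14,500 (about epoch 115) and then climbs steadily while training loss keeps falling, a classic overfitting pattern, so the final checkpoint is not the best one by validation loss. Below is a hedged sketch of guarding against this with transformers' EarlyStoppingCallback (not part of the original run); it reuses model and training_args from the sketches above, with train_dataset and eval_dataset standing in for the unspecified poem corpus.

```python
# Hedged sketch: stop once eval loss stops improving and restore the best
# checkpoint. NOT used in the original run; shown because the table above
# indicates overfitting after roughly epoch 115.
from transformers import EarlyStoppingCallback, Trainer

training_args.load_best_model_at_end = True
training_args.metric_for_best_model = "eval_loss"
training_args.greater_is_better = False
training_args.save_strategy = "steps"    # must match the evaluation cadence
training_args.save_steps = 500

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,          # placeholder: training split
    eval_dataset=eval_dataset,            # placeholder: evaluation split
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
# trainer.train()  # would halt near the ~0.96 validation-loss minimum
```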

Framework versions

  • Transformers 4.40.0
  • PyTorch 2.1.2
  • Datasets 2.16.1
  • Tokenizers 0.19.1
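
A small hedged snippet for checking that a local environment matches these pins; the version numbers come from the list above.

```python
# Hedged sketch: compare installed library versions against the ones this
# model was trained with, as listed above.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": ("4.40.0", transformers.__version__),
    "torch": ("2.1.2", torch.__version__),
    "datasets": ("2.16.1", datasets.__version__),
    "tokenizers": ("0.19.1", tokenizers.__version__),
}
for name, (want, have) in expected.items():
    mark = "OK" if have == want else f"mismatch (installed {have})"
    print(f"{name}: trained with {want} -> {mark}")
```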

Model size

  • 133M parameters (F32, Safetensors)