gpt2-large-finetuned2

This model is a fine-tuned version of gpt2-large on an unspecified dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the list):

  • Loss: 0.6494
  • Rouge1: 0.9235
  • Rouge2: 0.9153
  • RougeL: 0.9235
  • RougeLsum: 0.9235
  • Gen Len: 17.061
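
Since gpt2-large is a causal language model, the fine-tuned checkpoint can be loaded with the standard Transformers auto classes. The following is a minimal sketch, assuming the checkpoint is published under the repo id kowsiknd/gpt2-large-finetuned2; the expected prompt format is unknown because the training data is unspecified:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kowsiknd/gpt2-large-finetuned2"  # assumed repo id for this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Example input:"  # hypothetical; the prompt format is not documented
inputs = tokenizer(prompt, return_tensors="pt")
# Gen Len above averages ~17 tokens, so a small generation budget is a reasonable default.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```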

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 1
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
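
The training script itself is not included in this card, so the mapping below is a hedged reconstruction: it shows how the listed hyperparameters would be expressed as Transformers TrainingArguments. The output_dir is an assumption, and the per_device batch sizes assume single-device training.

```python
from transformers import TrainingArguments

# Sketch only: reconstructs the listed hyperparameters; output_dir is assumed.
training_args = TrainingArguments(
    output_dir="gpt2-large-finetuned2",
    learning_rate=3e-4,
    per_device_train_batch_size=32,  # assumes a single device
    per_device_eval_batch_size=32,
    seed=1,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=50,
)
```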

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 2.4757        | 1.0   | 278   | 1.5505          | 0.9312 | 0.9247 | 0.9312 | 0.9312    | 17.061  |
| 1.6314        | 2.0   | 556   | 1.2537          | 0.929  | 0.9222 | 0.929  | 0.929     | 17.061  |
| 1.3746        | 3.0   | 834   | 1.1054          | 0.9246 | 0.917  | 0.9246 | 0.9247    | 17.061  |
| 1.225         | 4.0   | 1112  | 1.0012          | 0.9294 | 0.9226 | 0.9293 | 0.9293    | 17.061  |
| 1.108         | 5.0   | 1390  | 0.9411          | 0.9253 | 0.9177 | 0.9253 | 0.9253    | 17.061  |
| 1.0381        | 6.0   | 1668  | 0.8901          | 0.9247 | 0.9173 | 0.9247 | 0.9247    | 17.061  |
| 0.9722        | 7.0   | 1946  | 0.8340          | 0.9247 | 0.917  | 0.9247 | 0.9247    | 17.061  |
| 0.9134        | 8.0   | 2224  | 0.7975          | 0.9236 | 0.9156 | 0.9236 | 0.9237    | 17.061  |
| 0.8894        | 9.0   | 2502  | 0.7745          | 0.9231 | 0.9158 | 0.9231 | 0.9232    | 17.061  |
| 0.8387        | 10.0  | 2780  | 0.7567          | 0.9212 | 0.9132 | 0.9212 | 0.9212    | 17.061  |
| 0.8224        | 11.0  | 3058  | 0.7374          | 0.9232 | 0.9152 | 0.9232 | 0.9232    | 17.061  |
| 0.8071        | 12.0  | 3336  | 0.7298          | 0.9237 | 0.9158 | 0.9237 | 0.9237    | 17.061  |
| 0.7973        | 13.0  | 3614  | 0.7209          | 0.9238 | 0.9161 | 0.9238 | 0.9238    | 17.061  |
| 0.7715        | 14.0  | 3892  | 0.7217          | 0.9231 | 0.915  | 0.9231 | 0.9231    | 17.061  |
| 0.771         | 15.0  | 4170  | 0.7085          | 0.9224 | 0.9139 | 0.9224 | 0.9224    | 17.061  |
| 0.7617        | 16.0  | 4448  | 0.7041          | 0.9211 | 0.9123 | 0.9211 | 0.9211    | 17.061  |
| 0.7603        | 17.0  | 4726  | 0.7004          | 0.9226 | 0.9146 | 0.9226 | 0.9227    | 17.061  |
| 0.7539        | 18.0  | 5004  | 0.6976          | 0.9253 | 0.9173 | 0.9252 | 0.9253    | 17.061  |
| 0.741         | 19.0  | 5282  | 0.6907          | 0.9229 | 0.9146 | 0.9229 | 0.9229    | 17.061  |
| 0.7422        | 20.0  | 5560  | 0.6898          | 0.9222 | 0.9141 | 0.9222 | 0.9222    | 17.061  |
| 0.7333        | 21.0  | 5838  | 0.6880          | 0.9223 | 0.9138 | 0.9223 | 0.9223    | 17.061  |
| 0.7378        | 22.0  | 6116  | 0.6837          | 0.9222 | 0.914  | 0.9222 | 0.9222    | 17.061  |
| 0.723         | 23.0  | 6394  | 0.6849          | 0.9225 | 0.914  | 0.9225 | 0.9225    | 17.061  |
| 0.7277        | 24.0  | 6672  | 0.6791          | 0.9235 | 0.9148 | 0.9235 | 0.9235    | 17.061  |
| 0.7222        | 25.0  | 6950  | 0.6834          | 0.9267 | 0.9189 | 0.9267 | 0.9267    | 17.061  |
| 0.7235        | 26.0  | 7228  | 0.6749          | 0.9221 | 0.9139 | 0.9221 | 0.9221    | 17.061  |
| 0.7207        | 27.0  | 7506  | 0.6741          | 0.9231 | 0.9149 | 0.9231 | 0.9231    | 17.061  |
| 0.7106        | 28.0  | 7784  | 0.6718          | 0.9224 | 0.9141 | 0.9224 | 0.9224    | 17.061  |
| 0.7086        | 29.0  | 8062  | 0.6706          | 0.9233 | 0.9153 | 0.9233 | 0.9233    | 17.061  |
| 0.7086        | 30.0  | 8340  | 0.6680          | 0.9241 | 0.9161 | 0.9241 | 0.9241    | 17.061  |
| 0.7081        | 31.0  | 8618  | 0.6678          | 0.9257 | 0.9177 | 0.9257 | 0.9257    | 17.061  |
| 0.6977        | 32.0  | 8896  | 0.6651          | 0.9229 | 0.9146 | 0.9229 | 0.9229    | 17.061  |
| 0.6937        | 33.0  | 9174  | 0.6634          | 0.9247 | 0.9167 | 0.9246 | 0.9247    | 17.061  |
| 0.6998        | 34.0  | 9452  | 0.6636          | 0.9243 | 0.916  | 0.9243 | 0.9243    | 17.061  |
| 0.6932        | 35.0  | 9730  | 0.6627          | 0.9254 | 0.9175 | 0.9254 | 0.9254    | 17.061  |
| 0.6978        | 36.0  | 10008 | 0.6612          | 0.9236 | 0.9154 | 0.9236 | 0.9236    | 17.061  |
| 0.6881        | 37.0  | 10286 | 0.6612          | 0.9251 | 0.9174 | 0.9251 | 0.9251    | 17.061  |
| 0.6874        | 38.0  | 10564 | 0.6589          | 0.9247 | 0.9167 | 0.9247 | 0.9247    | 17.061  |
| 0.6898        | 39.0  | 10842 | 0.6579          | 0.9235 | 0.9153 | 0.9235 | 0.9235    | 17.061  |
| 0.6857        | 40.0  | 11120 | 0.6568          | 0.9231 | 0.915  | 0.9231 | 0.9232    | 17.061  |
| 0.6751        | 41.0  | 11398 | 0.6554          | 0.924  | 0.9161 | 0.924  | 0.924     | 17.061  |
| 0.6782        | 42.0  | 11676 | 0.6547          | 0.9243 | 0.9164 | 0.9243 | 0.9243    | 17.061  |
| 0.6775        | 43.0  | 11954 | 0.6537          | 0.9242 | 0.9162 | 0.9242 | 0.9242    | 17.061  |
| 0.6764        | 44.0  | 12232 | 0.6530          | 0.923  | 0.9148 | 0.923  | 0.923     | 17.061  |
| 0.6741        | 45.0  | 12510 | 0.6524          | 0.9242 | 0.9161 | 0.9242 | 0.9242    | 17.061  |
| 0.6638        | 46.0  | 12788 | 0.6515          | 0.9241 | 0.9159 | 0.9241 | 0.9241    | 17.061  |
| 0.6634        | 47.0  | 13066 | 0.6509          | 0.9242 | 0.916  | 0.9242 | 0.9242    | 17.061  |
| 0.6614        | 48.0  | 13344 | 0.6500          | 0.9238 | 0.9156 | 0.9238 | 0.9238    | 17.061  |
| 0.6595        | 49.0  | 13622 | 0.6495          | 0.9236 | 0.9154 | 0.9236 | 0.9236    | 17.061  |
| 0.6541        | 50.0  | 13900 | 0.6494          | 0.9235 | 0.9153 | 0.9235 | 0.9235    | 17.061  |
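
The evaluation code is likewise not part of this card. As a hedged illustration only, ROUGE scores like those above are typically computed with the Hugging Face evaluate library; the texts below are hypothetical placeholders, not samples from the actual evaluation set:

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]  # hypothetical model outputs
references = ["the cat sat on the mat"]   # hypothetical reference texts
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```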

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.0.1
  • Datasets 2.14.6
  • Tokenizers 0.14.1