
gpt-neo-1.3B-vietnamese-news_model_kaggle_test

This model is a fine-tuned version of VietAI/gpt-neo-1.3B-vietnamese-news on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1164

Model description

More information needed

Intended uses & limitations

More information needed
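Because this repository ships a PEFT adapter rather than full model weights (see the framework versions below), inference requires loading the base model first and applying the adapter on top. A minimal sketch, assuming the adapter repo id matches the title of this card (substitute the actual Hub path):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "VietAI/gpt-neo-1.3B-vietnamese-news"
# Hypothetical adapter repo id, inferred from this card's title.
adapter_id = "gpt-neo-1.3B-vietnamese-news_model_kaggle_test"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Wrap the base model with the fine-tuned adapter weights.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

inputs = tokenizer("Hà Nội là", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```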

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
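With a linear scheduler and no warmup, the learning rate decays from its initial value to zero over the total number of training steps (here 1,500: 15 steps per epoch for 100 epochs, per the table below). A minimal sketch of that decay:

```python
def linear_lr(step: int, base_lr: float = 4e-05, total_steps: int = 1500) -> float:
    """Linearly decay the learning rate from base_lr to 0 over total_steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

# Learning rate at selected points in training
print(linear_lr(0))     # start of training: 4e-05
print(linear_lr(750))   # halfway (epoch 50): 2e-05
print(linear_lr(1500))  # end of training: 0.0
```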

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 15   | 0.9860          |
| No log        | 2.0   | 30   | 0.8441          |
| No log        | 3.0   | 45   | 0.7381          |
| No log        | 4.0   | 60   | 0.6412          |
| No log        | 5.0   | 75   | 0.5607          |
| No log        | 6.0   | 90   | 0.5032          |
| No log        | 7.0   | 105  | 0.4551          |
| No log        | 8.0   | 120  | 0.4227          |
| No log        | 9.0   | 135  | 0.3894          |
| No log        | 10.0  | 150  | 0.3648          |
| No log        | 11.0  | 165  | 0.3439          |
| No log        | 12.0  | 180  | 0.3260          |
| No log        | 13.0  | 195  | 0.3146          |
| No log        | 14.0  | 210  | 0.3017          |
| No log        | 15.0  | 225  | 0.2841          |
| No log        | 16.0  | 240  | 0.2750          |
| No log        | 17.0  | 255  | 0.2653          |
| No log        | 18.0  | 270  | 0.2546          |
| No log        | 19.0  | 285  | 0.2481          |
| No log        | 20.0  | 300  | 0.2403          |
| No log        | 21.0  | 315  | 0.2352          |
| No log        | 22.0  | 330  | 0.2270          |
| No log        | 23.0  | 345  | 0.2188          |
| No log        | 24.0  | 360  | 0.2139          |
| No log        | 25.0  | 375  | 0.2144          |
| No log        | 26.0  | 390  | 0.2045          |
| No log        | 27.0  | 405  | 0.2012          |
| No log        | 28.0  | 420  | 0.1979          |
| No log        | 29.0  | 435  | 0.1940          |
| No log        | 30.0  | 450  | 0.1896          |
| No log        | 31.0  | 465  | 0.1846          |
| No log        | 32.0  | 480  | 0.1793          |
| No log        | 33.0  | 495  | 0.1774          |
| 0.3784        | 34.0  | 510  | 0.1764          |
| 0.3784        | 35.0  | 525  | 0.1727          |
| 0.3784        | 36.0  | 540  | 0.1692          |
| 0.3784        | 37.0  | 555  | 0.1720          |
| 0.3784        | 38.0  | 570  | 0.1644          |
| 0.3784        | 39.0  | 585  | 0.1625          |
| 0.3784        | 40.0  | 600  | 0.1671          |
| 0.3784        | 41.0  | 615  | 0.1612          |
| 0.3784        | 42.0  | 630  | 0.1587          |
| 0.3784        | 43.0  | 645  | 0.1556          |
| 0.3784        | 44.0  | 660  | 0.1557          |
| 0.3784        | 45.0  | 675  | 0.1553          |
| 0.3784        | 46.0  | 690  | 0.1499          |
| 0.3784        | 47.0  | 705  | 0.1501          |
| 0.3784        | 48.0  | 720  | 0.1463          |
| 0.3784        | 49.0  | 735  | 0.1468          |
| 0.3784        | 50.0  | 750  | 0.1487          |
| 0.3784        | 51.0  | 765  | 0.1443          |
| 0.3784        | 52.0  | 780  | 0.1444          |
| 0.3784        | 53.0  | 795  | 0.1413          |
| 0.3784        | 54.0  | 810  | 0.1401          |
| 0.3784        | 55.0  | 825  | 0.1387          |
| 0.3784        | 56.0  | 840  | 0.1397          |
| 0.3784        | 57.0  | 855  | 0.1384          |
| 0.3784        | 58.0  | 870  | 0.1360          |
| 0.3784        | 59.0  | 885  | 0.1342          |
| 0.3784        | 60.0  | 900  | 0.1344          |
| 0.3784        | 61.0  | 915  | 0.1339          |
| 0.3784        | 62.0  | 930  | 0.1334          |
| 0.3784        | 63.0  | 945  | 0.1312          |
| 0.3784        | 64.0  | 960  | 0.1303          |
| 0.3784        | 65.0  | 975  | 0.1310          |
| 0.3784        | 66.0  | 990  | 0.1283          |
| 0.1561        | 67.0  | 1005 | 0.1277          |
| 0.1561        | 68.0  | 1020 | 0.1268          |
| 0.1561        | 69.0  | 1035 | 0.1268          |
| 0.1561        | 70.0  | 1050 | 0.1270          |
| 0.1561        | 71.0  | 1065 | 0.1260          |
| 0.1561        | 72.0  | 1080 | 0.1250          |
| 0.1561        | 73.0  | 1095 | 0.1240          |
| 0.1561        | 74.0  | 1110 | 0.1252          |
| 0.1561        | 75.0  | 1125 | 0.1235          |
| 0.1561        | 76.0  | 1140 | 0.1225          |
| 0.1561        | 77.0  | 1155 | 0.1223          |
| 0.1561        | 78.0  | 1170 | 0.1212          |
| 0.1561        | 79.0  | 1185 | 0.1208          |
| 0.1561        | 80.0  | 1200 | 0.1212          |
| 0.1561        | 81.0  | 1215 | 0.1199          |
| 0.1561        | 82.0  | 1230 | 0.1196          |
| 0.1561        | 83.0  | 1245 | 0.1193          |
| 0.1561        | 84.0  | 1260 | 0.1192          |
| 0.1561        | 85.0  | 1275 | 0.1187          |
| 0.1561        | 86.0  | 1290 | 0.1184          |
| 0.1561        | 87.0  | 1305 | 0.1182          |
| 0.1561        | 88.0  | 1320 | 0.1179          |
| 0.1561        | 89.0  | 1335 | 0.1177          |
| 0.1561        | 90.0  | 1350 | 0.1176          |
| 0.1561        | 91.0  | 1365 | 0.1175          |
| 0.1561        | 92.0  | 1380 | 0.1173          |
| 0.1561        | 93.0  | 1395 | 0.1170          |
| 0.1561        | 94.0  | 1410 | 0.1168          |
| 0.1561        | 95.0  | 1425 | 0.1168          |
| 0.1561        | 96.0  | 1440 | 0.1166          |
| 0.1561        | 97.0  | 1455 | 0.1165          |
| 0.1561        | 98.0  | 1470 | 0.1165          |
| 0.1561        | 99.0  | 1485 | 0.1164          |
| 0.1234        | 100.0 | 1500 | 0.1164          |

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.1
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2
