hoa-1b4_model_kaggle_test

This model is a fine-tuned version of vlsp-2023-vllm/hoa-1b4. The fine-tuning dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.0088
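
Since this repository contains a PEFT adapter rather than full model weights, it is loaded on top of the base model. Below is a minimal loading sketch; the adapter id your-username/hoa-1b4_model_kaggle_test is a placeholder for this repository's actual path, and the prompt is purely illustrative.

```python
# Minimal sketch: load the base model, then attach this PEFT adapter.
# The adapter id below is a placeholder, not a confirmed repository path.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("vlsp-2023-vllm/hoa-1b4")
tokenizer = AutoTokenizer.from_pretrained("vlsp-2023-vllm/hoa-1b4")
model = PeftModel.from_pretrained(base_model, "your-username/hoa-1b4_model_kaggle_test")

# Illustrative prompt ("Xin chào" = "Hello"); the base model is Vietnamese.
inputs = tokenizer("Xin chào", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```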

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

  • learning_rate: 4e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
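
For reference, the listed values map onto transformers TrainingArguments roughly as sketched below. Only the values listed above come from this card; output_dir and any setting not listed are assumptions, not the original configuration.

```python
from transformers import TrainingArguments

# Sketch reconstructing the listed hyperparameters; output_dir is assumed.
training_args = TrainingArguments(
    output_dir="hoa-1b4_model_kaggle_test",  # assumption, not from the card
    learning_rate=4e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=100,
)
```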

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 7 | 3.2766 |
| No log | 2.0 | 14 | 2.8527 |
| No log | 3.0 | 21 | 2.6111 |
| No log | 4.0 | 28 | 2.3834 |
| No log | 5.0 | 35 | 2.1930 |
| No log | 6.0 | 42 | 2.0336 |
| No log | 7.0 | 49 | 1.8676 |
| No log | 8.0 | 56 | 1.7066 |
| No log | 9.0 | 63 | 1.5550 |
| No log | 10.0 | 70 | 1.4148 |
| No log | 11.0 | 77 | 1.2909 |
| No log | 12.0 | 84 | 1.1698 |
| No log | 13.0 | 91 | 1.0494 |
| No log | 14.0 | 98 | 0.9417 |
| No log | 15.0 | 105 | 0.8308 |
| No log | 16.0 | 112 | 0.7236 |
| No log | 17.0 | 119 | 0.6239 |
| No log | 18.0 | 126 | 0.5360 |
| No log | 19.0 | 133 | 0.4547 |
| No log | 20.0 | 140 | 0.3911 |
| No log | 21.0 | 147 | 0.3288 |
| No log | 22.0 | 154 | 0.2807 |
| No log | 23.0 | 161 | 0.2396 |
| No log | 24.0 | 168 | 0.2013 |
| No log | 25.0 | 175 | 0.1764 |
| No log | 26.0 | 182 | 0.1483 |
| No log | 27.0 | 189 | 0.1240 |
| No log | 28.0 | 196 | 0.1128 |
| No log | 29.0 | 203 | 0.0983 |
| No log | 30.0 | 210 | 0.0868 |
| No log | 31.0 | 217 | 0.0775 |
| No log | 32.0 | 224 | 0.0722 |
| No log | 33.0 | 231 | 0.0613 |
| No log | 34.0 | 238 | 0.0570 |
| No log | 35.0 | 245 | 0.0495 |
| No log | 36.0 | 252 | 0.0441 |
| No log | 37.0 | 259 | 0.0409 |
| No log | 38.0 | 266 | 0.0360 |
| No log | 39.0 | 273 | 0.0314 |
| No log | 40.0 | 280 | 0.0296 |
| No log | 41.0 | 287 | 0.0250 |
| No log | 42.0 | 294 | 0.0231 |
| No log | 43.0 | 301 | 0.0241 |
| No log | 44.0 | 308 | 0.0196 |
| No log | 45.0 | 315 | 0.0183 |
| No log | 46.0 | 322 | 0.0176 |
| No log | 47.0 | 329 | 0.0173 |
| No log | 48.0 | 336 | 0.0143 |
| No log | 49.0 | 343 | 0.0145 |
| No log | 50.0 | 350 | 0.0138 |
| No log | 51.0 | 357 | 0.0131 |
| No log | 52.0 | 364 | 0.0138 |
| No log | 53.0 | 371 | 0.0142 |
| No log | 54.0 | 378 | 0.0137 |
| No log | 55.0 | 385 | 0.0119 |
| No log | 56.0 | 392 | 0.0123 |
| No log | 57.0 | 399 | 0.0122 |
| No log | 58.0 | 406 | 0.0112 |
| No log | 59.0 | 413 | 0.0114 |
| No log | 60.0 | 420 | 0.0112 |
| No log | 61.0 | 427 | 0.0108 |
| No log | 62.0 | 434 | 0.0105 |
| No log | 63.0 | 441 | 0.0120 |
| No log | 64.0 | 448 | 0.0110 |
| No log | 65.0 | 455 | 0.0115 |
| No log | 66.0 | 462 | 0.0102 |
| No log | 67.0 | 469 | 0.0104 |
| No log | 68.0 | 476 | 0.0113 |
| No log | 69.0 | 483 | 0.0098 |
| No log | 70.0 | 490 | 0.0101 |
| No log | 71.0 | 497 | 0.0101 |
| 0.4941 | 72.0 | 504 | 0.0094 |
| 0.4941 | 73.0 | 511 | 0.0097 |
| 0.4941 | 74.0 | 518 | 0.0095 |
| 0.4941 | 75.0 | 525 | 0.0101 |
| 0.4941 | 76.0 | 532 | 0.0095 |
| 0.4941 | 77.0 | 539 | 0.0100 |
| 0.4941 | 78.0 | 546 | 0.0092 |
| 0.4941 | 79.0 | 553 | 0.0094 |
| 0.4941 | 80.0 | 560 | 0.0095 |
| 0.4941 | 81.0 | 567 | 0.0094 |
| 0.4941 | 82.0 | 574 | 0.0093 |
| 0.4941 | 83.0 | 581 | 0.0093 |
| 0.4941 | 84.0 | 588 | 0.0094 |
| 0.4941 | 85.0 | 595 | 0.0092 |
| 0.4941 | 86.0 | 602 | 0.0089 |
| 0.4941 | 87.0 | 609 | 0.0090 |
| 0.4941 | 88.0 | 616 | 0.0090 |
| 0.4941 | 89.0 | 623 | 0.0089 |
| 0.4941 | 90.0 | 630 | 0.0088 |
| 0.4941 | 91.0 | 637 | 0.0089 |
| 0.4941 | 92.0 | 644 | 0.0088 |
| 0.4941 | 93.0 | 651 | 0.0088 |
| 0.4941 | 94.0 | 658 | 0.0088 |
| 0.4941 | 95.0 | 665 | 0.0088 |
| 0.4941 | 96.0 | 672 | 0.0088 |
| 0.4941 | 97.0 | 679 | 0.0088 |
| 0.4941 | 98.0 | 686 | 0.0088 |
| 0.4941 | 99.0 | 693 | 0.0088 |
| 0.4941 | 100.0 | 700 | 0.0088 |

The Training Loss column reads "No log" through epoch 71, presumably because the Trainer logs training loss every 500 steps by default: at 7 optimizer steps per epoch, the first log lands at step 504 (epoch 72).

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.1
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2
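
Assuming a plain pip environment (a CUDA-specific torch build may be needed instead), the versions above can be pinned with:

```bash
pip install peft==0.9.0 transformers==4.38.1 torch==2.1.2 datasets==2.1.0 tokenizers==0.15.2
```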