
hoa-1b4_model_118_time_March9th_format

This model is a PEFT adapter fine-tuned from vlsp-2023-vllm/hoa-1b4 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0472
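
Because the Framework versions below list PEFT, this checkpoint is an adapter that must be loaded on top of the base model rather than a standalone model. A minimal loading sketch follows; the adapter repo id is a hypothetical placeholder (substitute the actual one), and the prompt is only an example:

```python
# Minimal sketch: load the PEFT adapter on top of the base model.
# "your-username/..." is a hypothetical placeholder repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "vlsp-2023-vllm/hoa-1b4"
adapter_id = "your-username/hoa-1b4_model_118_time_March9th_format"  # hypothetical

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Xin chào", return_tensors="pt")  # example prompt
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```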

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (restated as a TrainingArguments sketch after this list):

  • learning_rate: 4e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
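
These settings map onto transformers.TrainingArguments roughly as follows. This is a reconstruction from the list above, not the original training script; output_dir is a placeholder and the per-epoch evaluation strategy is inferred from the results table below:

```python
# Sketch reconstructing the hyperparameters above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hoa-1b4_model_118_time_March9th_format",  # placeholder
    learning_rate=4e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,               # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # assumed: the table below logs once per epoch
)
```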

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| No log        | 1.0   | 465   | 0.1367          |
| 0.4568        | 2.0   | 930   | 0.0717          |
| 0.0992        | 3.0   | 1395  | 0.0510          |
| 0.0659        | 4.0   | 1860  | 0.0434          |
| 0.0491        | 5.0   | 2325  | 0.0411          |
| 0.0406        | 6.0   | 2790  | 0.0374          |
| 0.0393        | 7.0   | 3255  | 0.0329          |
| 0.0348        | 8.0   | 3720  | 0.0317          |
| 0.0343        | 9.0   | 4185  | 0.0325          |
| 0.033         | 10.0  | 4650  | 0.0306          |
| 0.0304        | 11.0  | 5115  | 0.0283          |
| 0.0289        | 12.0  | 5580  | 0.0294          |
| 0.0283        | 13.0  | 6045  | 0.0281          |
| 0.028         | 14.0  | 6510  | 0.0271          |
| 0.028         | 15.0  | 6975  | 0.0278          |
| 0.0265        | 16.0  | 7440  | 0.0290          |
| 0.0263        | 17.0  | 7905  | 0.0296          |
| 0.0257        | 18.0  | 8370  | 0.0306          |
| 0.0255        | 19.0  | 8835  | 0.0302          |
| 0.0251        | 20.0  | 9300  | 0.0258          |
| 0.0239        | 21.0  | 9765  | 0.0324          |
| 0.0252        | 22.0  | 10230 | 0.0304          |
| 0.024         | 23.0  | 10695 | 0.0283          |
| 0.0236        | 24.0  | 11160 | 0.0323          |
| 0.0231        | 25.0  | 11625 | 0.0287          |
| 0.0233        | 26.0  | 12090 | 0.0337          |
| 0.0229        | 27.0  | 12555 | 0.0328          |
| 0.0228        | 28.0  | 13020 | 0.0286          |
| 0.0228        | 29.0  | 13485 | 0.0344          |
| 0.0224        | 30.0  | 13950 | 0.0340          |
| 0.0226        | 31.0  | 14415 | 0.0351          |
| 0.0229        | 32.0  | 14880 | 0.0311          |
| 0.022         | 33.0  | 15345 | 0.0345          |
| 0.0224        | 34.0  | 15810 | 0.0340          |
| 0.0219        | 35.0  | 16275 | 0.0336          |
| 0.0213        | 36.0  | 16740 | 0.0364          |
| 0.0222        | 37.0  | 17205 | 0.0369          |
| 0.0215        | 38.0  | 17670 | 0.0349          |
| 0.0209        | 39.0  | 18135 | 0.0382          |
| 0.0211        | 40.0  | 18600 | 0.0336          |
| 0.0209        | 41.0  | 19065 | 0.0330          |
| 0.0208        | 42.0  | 19530 | 0.0369          |
| 0.0208        | 43.0  | 19995 | 0.0380          |
| 0.0208        | 44.0  | 20460 | 0.0396          |
| 0.0201        | 45.0  | 20925 | 0.0407          |
| 0.021         | 46.0  | 21390 | 0.0364          |
| 0.0203        | 47.0  | 21855 | 0.0375          |
| 0.0205        | 48.0  | 22320 | 0.0414          |
| 0.0203        | 49.0  | 22785 | 0.0345          |
| 0.0203        | 50.0  | 23250 | 0.0425          |
| 0.0199        | 51.0  | 23715 | 0.0366          |
| 0.0205        | 52.0  | 24180 | 0.0423          |
| 0.0198        | 53.0  | 24645 | 0.0399          |
| 0.0206        | 54.0  | 25110 | 0.0404          |
| 0.0196        | 55.0  | 25575 | 0.0434          |
| 0.0201        | 56.0  | 26040 | 0.0362          |
| 0.02          | 57.0  | 26505 | 0.0474          |
| 0.02          | 58.0  | 26970 | 0.0367          |
| 0.0201        | 59.0  | 27435 | 0.0385          |
| 0.0201        | 60.0  | 27900 | 0.0446          |
| 0.0197        | 61.0  | 28365 | 0.0387          |
| 0.0196        | 62.0  | 28830 | 0.0399          |
| 0.0191        | 63.0  | 29295 | 0.0374          |
| 0.02          | 64.0  | 29760 | 0.0421          |
| 0.0191        | 65.0  | 30225 | 0.0428          |
| 0.0197        | 66.0  | 30690 | 0.0418          |
| 0.0189        | 67.0  | 31155 | 0.0441          |
| 0.0195        | 68.0  | 31620 | 0.0424          |
| 0.0196        | 69.0  | 32085 | 0.0432          |
| 0.0188        | 70.0  | 32550 | 0.0432          |
| 0.0194        | 71.0  | 33015 | 0.0475          |
| 0.0194        | 72.0  | 33480 | 0.0429          |
| 0.0187        | 73.0  | 33945 | 0.0420          |
| 0.0188        | 74.0  | 34410 | 0.0364          |
| 0.0189        | 75.0  | 34875 | 0.0461          |
| 0.0187        | 76.0  | 35340 | 0.0459          |
| 0.0191        | 77.0  | 35805 | 0.0468          |
| 0.0182        | 78.0  | 36270 | 0.0465          |
| 0.0185        | 79.0  | 36735 | 0.0412          |
| 0.0191        | 80.0  | 37200 | 0.0454          |
| 0.0186        | 81.0  | 37665 | 0.0451          |
| 0.0187        | 82.0  | 38130 | 0.0454          |
| 0.019         | 83.0  | 38595 | 0.0455          |
| 0.0184        | 84.0  | 39060 | 0.0430          |
| 0.0183        | 85.0  | 39525 | 0.0498          |
| 0.0183        | 86.0  | 39990 | 0.0459          |
| 0.0187        | 87.0  | 40455 | 0.0471          |
| 0.0183        | 88.0  | 40920 | 0.0497          |
| 0.0184        | 89.0  | 41385 | 0.0437          |
| 0.018         | 90.0  | 41850 | 0.0480          |
| 0.0181        | 91.0  | 42315 | 0.0492          |
| 0.0181        | 92.0  | 42780 | 0.0481          |
| 0.018         | 93.0  | 43245 | 0.0494          |
| 0.0183        | 94.0  | 43710 | 0.0495          |
| 0.0178        | 95.0  | 44175 | 0.0471          |
| 0.0181        | 96.0  | 44640 | 0.0484          |
| 0.018         | 97.0  | 45105 | 0.0463          |
| 0.0177        | 98.0  | 45570 | 0.0467          |
| 0.0175        | 99.0  | 46035 | 0.0479          |
| 0.018         | 100.0 | 46500 | 0.0472          |

Framework versions

  • PEFT 0.4.0
  • Transformers 4.38.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.17.0
  • Tokenizers 0.15.2
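
A quick way to check that a local environment matches these pins (a sketch; all five packages expose a __version__ attribute):

```python
# Print installed versions to compare against the pins listed above.
import datasets
import peft
import tokenizers
import torch
import transformers

for name, mod in [
    ("PEFT", peft),
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {mod.__version__}")
```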