
hoa-1b4_model_kc_server_March9th_time_format

This model is a PEFT adapter fine-tuned from vlsp-2023-vllm/hoa-1b4 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0459

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a hedged sketch of the equivalent TrainingArguments follows the list:

  • learning_rate: 4e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
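
As a minimal sketch, the hyperparameters above map onto Hugging Face TrainingArguments roughly as follows. The output_dir and per-epoch evaluation cadence are assumptions (the results table logs one eval per epoch at 465 steps each); the card does not state how training was actually wired up.

```python
# Sketch only: reconstructs the listed hyperparameters as TrainingArguments.
# output_dir and evaluation_strategy are assumptions, not stated in the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hoa-1b4_model_kc_server_March9th_time_format",  # assumed
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,          # Adam betas from the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # assumed from the per-epoch results table
)
```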

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 465 | 0.1350 |
| 0.4563 | 2.0 | 930 | 0.0705 |
| 0.0994 | 3.0 | 1395 | 0.0517 |
| 0.0605 | 4.0 | 1860 | 0.0436 |
| 0.051 | 5.0 | 2325 | 0.0369 |
| 0.043 | 6.0 | 2790 | 0.0355 |
| 0.0372 | 7.0 | 3255 | 0.0336 |
| 0.0365 | 8.0 | 3720 | 0.0330 |
| 0.034 | 9.0 | 4185 | 0.0299 |
| 0.031 | 10.0 | 4650 | 0.0296 |
| 0.0293 | 11.0 | 5115 | 0.0284 |
| 0.0285 | 12.0 | 5580 | 0.0305 |
| 0.0276 | 13.0 | 6045 | 0.0299 |
| 0.0281 | 14.0 | 6510 | 0.0276 |
| 0.0281 | 15.0 | 6975 | 0.0296 |
| 0.0268 | 16.0 | 7440 | 0.0290 |
| 0.0261 | 17.0 | 7905 | 0.0295 |
| 0.0259 | 18.0 | 8370 | 0.0320 |
| 0.0254 | 19.0 | 8835 | 0.0303 |
| 0.0248 | 20.0 | 9300 | 0.0284 |
| 0.0247 | 21.0 | 9765 | 0.0262 |
| 0.0239 | 22.0 | 10230 | 0.0298 |
| 0.0238 | 23.0 | 10695 | 0.0386 |
| 0.0234 | 24.0 | 11160 | 0.0306 |
| 0.0242 | 25.0 | 11625 | 0.0311 |
| 0.0232 | 26.0 | 12090 | 0.0342 |
| 0.0235 | 27.0 | 12555 | 0.0303 |
| 0.0228 | 28.0 | 13020 | 0.0341 |
| 0.0228 | 29.0 | 13485 | 0.0317 |
| 0.0229 | 30.0 | 13950 | 0.0295 |
| 0.0223 | 31.0 | 14415 | 0.0301 |
| 0.0221 | 32.0 | 14880 | 0.0314 |
| 0.0219 | 33.0 | 15345 | 0.0349 |
| 0.0221 | 34.0 | 15810 | 0.0383 |
| 0.0219 | 35.0 | 16275 | 0.0324 |
| 0.0213 | 36.0 | 16740 | 0.0319 |
| 0.0222 | 37.0 | 17205 | 0.0371 |
| 0.0217 | 38.0 | 17670 | 0.0355 |
| 0.0217 | 39.0 | 18135 | 0.0331 |
| 0.021 | 40.0 | 18600 | 0.0323 |
| 0.0205 | 41.0 | 19065 | 0.0417 |
| 0.0212 | 42.0 | 19530 | 0.0333 |
| 0.0212 | 43.0 | 19995 | 0.0400 |
| 0.021 | 44.0 | 20460 | 0.0357 |
| 0.0212 | 45.0 | 20925 | 0.0377 |
| 0.021 | 46.0 | 21390 | 0.0358 |
| 0.0203 | 47.0 | 21855 | 0.0356 |
| 0.0207 | 48.0 | 22320 | 0.0371 |
| 0.0205 | 49.0 | 22785 | 0.0408 |
| 0.0203 | 50.0 | 23250 | 0.0409 |
| 0.0197 | 51.0 | 23715 | 0.0460 |
| 0.0209 | 52.0 | 24180 | 0.0396 |
| 0.0204 | 53.0 | 24645 | 0.0396 |
| 0.0204 | 54.0 | 25110 | 0.0465 |
| 0.0199 | 55.0 | 25575 | 0.0419 |
| 0.0202 | 56.0 | 26040 | 0.0485 |
| 0.0198 | 57.0 | 26505 | 0.0426 |
| 0.0198 | 58.0 | 26970 | 0.0389 |
| 0.0199 | 59.0 | 27435 | 0.0412 |
| 0.0197 | 60.0 | 27900 | 0.0355 |
| 0.0197 | 61.0 | 28365 | 0.0414 |
| 0.02 | 62.0 | 28830 | 0.0420 |
| 0.0197 | 63.0 | 29295 | 0.0333 |
| 0.0196 | 64.0 | 29760 | 0.0443 |
| 0.0195 | 65.0 | 30225 | 0.0428 |
| 0.0193 | 66.0 | 30690 | 0.0409 |
| 0.0195 | 67.0 | 31155 | 0.0409 |
| 0.0195 | 68.0 | 31620 | 0.0392 |
| 0.0192 | 69.0 | 32085 | 0.0429 |
| 0.0192 | 70.0 | 32550 | 0.0400 |
| 0.0191 | 71.0 | 33015 | 0.0358 |
| 0.0191 | 72.0 | 33480 | 0.0378 |
| 0.0193 | 73.0 | 33945 | 0.0434 |
| 0.0193 | 74.0 | 34410 | 0.0395 |
| 0.0189 | 75.0 | 34875 | 0.0472 |
| 0.0188 | 76.0 | 35340 | 0.0388 |
| 0.0191 | 77.0 | 35805 | 0.0426 |
| 0.0182 | 78.0 | 36270 | 0.0460 |
| 0.0197 | 79.0 | 36735 | 0.0412 |
| 0.0186 | 80.0 | 37200 | 0.0444 |
| 0.0189 | 81.0 | 37665 | 0.0449 |
| 0.0183 | 82.0 | 38130 | 0.0487 |
| 0.0184 | 83.0 | 38595 | 0.0455 |
| 0.0184 | 84.0 | 39060 | 0.0443 |
| 0.0185 | 85.0 | 39525 | 0.0449 |
| 0.0185 | 86.0 | 39990 | 0.0430 |
| 0.0186 | 87.0 | 40455 | 0.0457 |
| 0.0186 | 88.0 | 40920 | 0.0497 |
| 0.0183 | 89.0 | 41385 | 0.0451 |
| 0.0182 | 90.0 | 41850 | 0.0465 |
| 0.0186 | 91.0 | 42315 | 0.0452 |
| 0.0177 | 92.0 | 42780 | 0.0472 |
| 0.0189 | 93.0 | 43245 | 0.0454 |
| 0.0175 | 94.0 | 43710 | 0.0449 |
| 0.0187 | 95.0 | 44175 | 0.0455 |
| 0.018 | 96.0 | 44640 | 0.0453 |
| 0.0181 | 97.0 | 45105 | 0.0472 |
| 0.0179 | 98.0 | 45570 | 0.0450 |
| 0.0179 | 99.0 | 46035 | 0.0462 |
| 0.0181 | 100.0 | 46500 | 0.0459 |

Framework versions

  • PEFT 0.8.2
  • Transformers 4.37.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.17.0
  • Tokenizers 0.15.2
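
Since PEFT appears among the framework versions, the checkpoint is presumably a lightweight adapter on top of the base causal LM rather than a full set of weights. A minimal loading sketch under that assumption follows; the `<namespace>/...` repo id is a placeholder for wherever this adapter is hosted.

```python
# Sketch only: loads the base model and applies this adapter with PEFT.
# "<namespace>/..." is a placeholder repo id, not confirmed by the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("vlsp-2023-vllm/hoa-1b4")
tokenizer = AutoTokenizer.from_pretrained("vlsp-2023-vllm/hoa-1b4")
model = PeftModel.from_pretrained(
    base, "<namespace>/hoa-1b4_model_kc_server_March9th_time_format"
)
model.eval()

inputs = tokenizer("Xin chào", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```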