
hoa-1b4_model_kaggle_format

This model is a fine-tuned version of vlsp-2023-vllm/hoa-1b4 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5927
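
Given the PEFT version listed under Framework versions below, this repository most likely contains a parameter-efficient adapter for the base model rather than full model weights. A minimal loading sketch, assuming a causal-LM adapter; the repo id `your-username/hoa-1b4_model_kaggle_format` is a placeholder:

```python
# A minimal usage sketch, assuming this repo hosts a PEFT adapter for the
# causal LM vlsp-2023-vllm/hoa-1b4. The adapter repo id is a placeholder.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("your-username/hoa-1b4_model_kaggle_format")
tokenizer = AutoTokenizer.from_pretrained("vlsp-2023-vllm/hoa-1b4")

inputs = tokenizer("Xin chào,", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```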

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 4e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
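
For reference, here is a sketch of how these settings map onto `transformers.TrainingArguments` (not the authors' actual training script; the output path is a placeholder and model/data loading is omitted):

```python
# A configuration sketch mapping the listed hyperparameters onto
# transformers.TrainingArguments (4.38.x); not the authors' exact script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="hoa-1b4_model_kaggle_format",  # placeholder output path
    learning_rate=4e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,               # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # consistent with the per-epoch losses below
)
```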

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 65   | 2.6363          |
| No log        | 2.0   | 130  | 1.8356          |
| No log        | 3.0   | 195  | 1.3984          |
| No log        | 4.0   | 260  | 1.1658          |
| No log        | 5.0   | 325  | 0.9857          |
| No log        | 6.0   | 390  | 0.8724          |
| No log        | 7.0   | 455  | 0.8085          |
| 1.4171        | 8.0   | 520  | 0.7400          |
| 1.4171        | 9.0   | 585  | 0.6925          |
| 1.4171        | 10.0  | 650  | 0.6654          |
| 1.4171        | 11.0  | 715  | 0.6383          |
| 1.4171        | 12.0  | 780  | 0.6341          |
| 1.4171        | 13.0  | 845  | 0.6148          |
| 1.4171        | 14.0  | 910  | 0.5979          |
| 1.4171        | 15.0  | 975  | 0.6061          |
| 0.2596        | 16.0  | 1040 | 0.5960          |
| 0.2596        | 17.0  | 1105 | 0.5810          |
| 0.2596        | 18.0  | 1170 | 0.5812          |
| 0.2596        | 19.0  | 1235 | 0.5761          |
| 0.2596        | 20.0  | 1300 | 0.5724          |
| 0.2596        | 21.0  | 1365 | 0.5600          |
| 0.2596        | 22.0  | 1430 | 0.5927          |
| 0.2596        | 23.0  | 1495 | 0.5627          |
| 0.1245        | 24.0  | 1560 | 0.5500          |
| 0.1245        | 25.0  | 1625 | 0.5706          |
| 0.1245        | 26.0  | 1690 | 0.5551          |
| 0.1245        | 27.0  | 1755 | 0.5548          |
| 0.1245        | 28.0  | 1820 | 0.5573          |
| 0.1245        | 29.0  | 1885 | 0.5642          |
| 0.1245        | 30.0  | 1950 | 0.5712          |
| 0.0896        | 31.0  | 2015 | 0.5524          |
| 0.0896        | 32.0  | 2080 | 0.5644          |
| 0.0896        | 33.0  | 2145 | 0.5511          |
| 0.0896        | 34.0  | 2210 | 0.5648          |
| 0.0896        | 35.0  | 2275 | 0.5722          |
| 0.0896        | 36.0  | 2340 | 0.5619          |
| 0.0896        | 37.0  | 2405 | 0.5632          |
| 0.0896        | 38.0  | 2470 | 0.5628          |
| 0.0746        | 39.0  | 2535 | 0.5593          |
| 0.0746        | 40.0  | 2600 | 0.5624          |
| 0.0746        | 41.0  | 2665 | 0.5744          |
| 0.0746        | 42.0  | 2730 | 0.5525          |
| 0.0746        | 43.0  | 2795 | 0.5858          |
| 0.0746        | 44.0  | 2860 | 0.5615          |
| 0.0746        | 45.0  | 2925 | 0.5614          |
| 0.0746        | 46.0  | 2990 | 0.5678          |
| 0.0696        | 47.0  | 3055 | 0.5735          |
| 0.0696        | 48.0  | 3120 | 0.5674          |
| 0.0696        | 49.0  | 3185 | 0.5637          |
| 0.0696        | 50.0  | 3250 | 0.5623          |
| 0.0696        | 51.0  | 3315 | 0.5668          |
| 0.0696        | 52.0  | 3380 | 0.5625          |
| 0.0696        | 53.0  | 3445 | 0.5630          |
| 0.0636        | 54.0  | 3510 | 0.5675          |
| 0.0636        | 55.0  | 3575 | 0.5646          |
| 0.0636        | 56.0  | 3640 | 0.5702          |
| 0.0636        | 57.0  | 3705 | 0.5729          |
| 0.0636        | 58.0  | 3770 | 0.5745          |
| 0.0636        | 59.0  | 3835 | 0.5737          |
| 0.0636        | 60.0  | 3900 | 0.5724          |
| 0.0636        | 61.0  | 3965 | 0.5658          |
| 0.0579        | 62.0  | 4030 | 0.5759          |
| 0.0579        | 63.0  | 4095 | 0.5777          |
| 0.0579        | 64.0  | 4160 | 0.5722          |
| 0.0579        | 65.0  | 4225 | 0.5721          |
| 0.0579        | 66.0  | 4290 | 0.5772          |
| 0.0579        | 67.0  | 4355 | 0.5747          |
| 0.0579        | 68.0  | 4420 | 0.5800          |
| 0.0579        | 69.0  | 4485 | 0.5814          |
| 0.0557        | 70.0  | 4550 | 0.5777          |
| 0.0557        | 71.0  | 4615 | 0.5765          |
| 0.0557        | 72.0  | 4680 | 0.5790          |
| 0.0557        | 73.0  | 4745 | 0.5845          |
| 0.0557        | 74.0  | 4810 | 0.5788          |
| 0.0557        | 75.0  | 4875 | 0.5836          |
| 0.0557        | 76.0  | 4940 | 0.5911          |
| 0.052         | 77.0  | 5005 | 0.5841          |
| 0.052         | 78.0  | 5070 | 0.5822          |
| 0.052         | 79.0  | 5135 | 0.5828          |
| 0.052         | 80.0  | 5200 | 0.5868          |
| 0.052         | 81.0  | 5265 | 0.5858          |
| 0.052         | 82.0  | 5330 | 0.5899          |
| 0.052         | 83.0  | 5395 | 0.5888          |
| 0.052         | 84.0  | 5460 | 0.5871          |
| 0.0478        | 85.0  | 5525 | 0.5867          |
| 0.0478        | 86.0  | 5590 | 0.5894          |
| 0.0478        | 87.0  | 5655 | 0.5899          |
| 0.0478        | 88.0  | 5720 | 0.5899          |
| 0.0478        | 89.0  | 5785 | 0.5915          |
| 0.0478        | 90.0  | 5850 | 0.5901          |
| 0.0478        | 91.0  | 5915 | 0.5919          |
| 0.0478        | 92.0  | 5980 | 0.5919          |
| 0.0458        | 93.0  | 6045 | 0.5916          |
| 0.0458        | 94.0  | 6110 | 0.5914          |
| 0.0458        | 95.0  | 6175 | 0.5929          |
| 0.0458        | 96.0  | 6240 | 0.5920          |
| 0.0458        | 97.0  | 6305 | 0.5922          |
| 0.0458        | 98.0  | 6370 | 0.5922          |
| 0.0458        | 99.0  | 6435 | 0.5924          |
| 0.0425        | 100.0 | 6500 | 0.5927          |

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.1
  • PyTorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.2
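
A quick environment check against the pinned versions above (a convenience sketch, not part of the original card):

```python
# Print installed library versions next to those listed on this card.
import datasets, peft, tokenizers, torch, transformers

expected = {
    peft: "0.9.0",
    transformers: "4.38.1",
    torch: "2.1.2",
    datasets: "2.1.0",
    tokenizers: "0.15.2",
}
for module, version in expected.items():
    print(f"{module.__name__}: installed {module.__version__}, card lists {version}")
```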