
hoa-1b4_model_nmt_test

This model is a fine-tuned version of vlsp-2023-vllm/hoa-1b4 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0045

Model description

More information needed

Intended uses & limitations

More information needed
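
Since this checkpoint ships as a PEFT adapter (see Framework versions below), a minimal loading-and-generation sketch follows. The adapter repo id `your-namespace/hoa-1b4_model_nmt_test` is a placeholder, not a confirmed Hub path, and the prompt is illustrative only:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "vlsp-2023-vllm/hoa-1b4"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

# Placeholder repo id -- substitute the actual adapter path on the Hub.
adapter_id = "your-namespace/hoa-1b4_model_nmt_test"
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Xin chào", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```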

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstructed configuration sketch follows the list):

  • learning_rate: 4e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
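
For reference, these settings map onto a `transformers` `TrainingArguments` object roughly as sketched below. This is a reconstruction, not the original training script; the `output_dir` is assumed, and the Adam betas/epsilon listed above are the optimizer defaults, so they are left implicit:

```python
from transformers import TrainingArguments

# Reconstruction of the reported settings; output_dir is an assumption.
training_args = TrainingArguments(
    output_dir="hoa-1b4_model_nmt_test",
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer.
)
```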

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 21   | 2.8255          |
| No log        | 2.0   | 42   | 2.3028          |
| No log        | 3.0   | 63   | 1.8727          |
| No log        | 4.0   | 84   | 1.5161          |
| No log        | 5.0   | 105  | 1.2181          |
| No log        | 6.0   | 126  | 0.9991          |
| No log        | 7.0   | 147  | 0.7980          |
| No log        | 8.0   | 168  | 0.6372          |
| No log        | 9.0   | 189  | 0.5075          |
| No log        | 10.0  | 210  | 0.4042          |
| No log        | 11.0  | 231  | 0.3321          |
| No log        | 12.0  | 252  | 0.2716          |
| No log        | 13.0  | 273  | 0.2143          |
| No log        | 14.0  | 294  | 0.1740          |
| No log        | 15.0  | 315  | 0.1397          |
| No log        | 16.0  | 336  | 0.1263          |
| No log        | 17.0  | 357  | 0.0990          |
| No log        | 18.0  | 378  | 0.0853          |
| No log        | 19.0  | 399  | 0.0678          |
| No log        | 20.0  | 420  | 0.0546          |
| No log        | 21.0  | 441  | 0.0476          |
| No log        | 22.0  | 462  | 0.0441          |
| No log        | 23.0  | 483  | 0.0367          |
| 0.7202        | 24.0  | 504  | 0.0292          |
| 0.7202        | 25.0  | 525  | 0.0241          |
| 0.7202        | 26.0  | 546  | 0.0227          |
| 0.7202        | 27.0  | 567  | 0.0207          |
| 0.7202        | 28.0  | 588  | 0.0186          |
| 0.7202        | 29.0  | 609  | 0.0168          |
| 0.7202        | 30.0  | 630  | 0.0139          |
| 0.7202        | 31.0  | 651  | 0.0126          |
| 0.7202        | 32.0  | 672  | 0.0113          |
| 0.7202        | 33.0  | 693  | 0.0113          |
| 0.7202        | 34.0  | 714  | 0.0107          |
| 0.7202        | 35.0  | 735  | 0.0099          |
| 0.7202        | 36.0  | 756  | 0.0087          |
| 0.7202        | 37.0  | 777  | 0.0085          |
| 0.7202        | 38.0  | 798  | 0.0080          |
| 0.7202        | 39.0  | 819  | 0.0077          |
| 0.7202        | 40.0  | 840  | 0.0072          |
| 0.7202        | 41.0  | 861  | 0.0071          |
| 0.7202        | 42.0  | 882  | 0.0070          |
| 0.7202        | 43.0  | 903  | 0.0068          |
| 0.7202        | 44.0  | 924  | 0.0064          |
| 0.7202        | 45.0  | 945  | 0.0063          |
| 0.7202        | 46.0  | 966  | 0.0061          |
| 0.7202        | 47.0  | 987  | 0.0061          |
| 0.0146        | 48.0  | 1008 | 0.0060          |
| 0.0146        | 49.0  | 1029 | 0.0058          |
| 0.0146        | 50.0  | 1050 | 0.0059          |
| 0.0146        | 51.0  | 1071 | 0.0067          |
| 0.0146        | 52.0  | 1092 | 0.0056          |
| 0.0146        | 53.0  | 1113 | 0.0055          |
| 0.0146        | 54.0  | 1134 | 0.0055          |
| 0.0146        | 55.0  | 1155 | 0.0053          |
| 0.0146        | 56.0  | 1176 | 0.0055          |
| 0.0146        | 57.0  | 1197 | 0.0055          |
| 0.0146        | 58.0  | 1218 | 0.0057          |
| 0.0146        | 59.0  | 1239 | 0.0053          |
| 0.0146        | 60.0  | 1260 | 0.0052          |
| 0.0146        | 61.0  | 1281 | 0.0052          |
| 0.0146        | 62.0  | 1302 | 0.0051          |
| 0.0146        | 63.0  | 1323 | 0.0050          |
| 0.0146        | 64.0  | 1344 | 0.0049          |
| 0.0146        | 65.0  | 1365 | 0.0050          |
| 0.0146        | 66.0  | 1386 | 0.0049          |
| 0.0146        | 67.0  | 1407 | 0.0049          |
| 0.0146        | 68.0  | 1428 | 0.0050          |
| 0.0146        | 69.0  | 1449 | 0.0049          |
| 0.0146        | 70.0  | 1470 | 0.0049          |
| 0.0146        | 71.0  | 1491 | 0.0048          |
| 0.0064        | 72.0  | 1512 | 0.0048          |
| 0.0064        | 73.0  | 1533 | 0.0047          |
| 0.0064        | 74.0  | 1554 | 0.0048          |
| 0.0064        | 75.0  | 1575 | 0.0048          |
| 0.0064        | 76.0  | 1596 | 0.0047          |
| 0.0064        | 77.0  | 1617 | 0.0047          |
| 0.0064        | 78.0  | 1638 | 0.0047          |
| 0.0064        | 79.0  | 1659 | 0.0047          |
| 0.0064        | 80.0  | 1680 | 0.0048          |
| 0.0064        | 81.0  | 1701 | 0.0046          |
| 0.0064        | 82.0  | 1722 | 0.0046          |
| 0.0064        | 83.0  | 1743 | 0.0046          |
| 0.0064        | 84.0  | 1764 | 0.0046          |
| 0.0064        | 85.0  | 1785 | 0.0046          |
| 0.0064        | 86.0  | 1806 | 0.0046          |
| 0.0064        | 87.0  | 1827 | 0.0046          |
| 0.0064        | 88.0  | 1848 | 0.0046          |
| 0.0064        | 89.0  | 1869 | 0.0046          |
| 0.0064        | 90.0  | 1890 | 0.0046          |
| 0.0064        | 91.0  | 1911 | 0.0045          |
| 0.0064        | 92.0  | 1932 | 0.0045          |
| 0.0064        | 93.0  | 1953 | 0.0045          |
| 0.0064        | 94.0  | 1974 | 0.0045          |
| 0.0064        | 95.0  | 1995 | 0.0045          |
| 0.0052        | 96.0  | 2016 | 0.0045          |
| 0.0052        | 97.0  | 2037 | 0.0045          |
| 0.0052        | 98.0  | 2058 | 0.0045          |
| 0.0052        | 99.0  | 2079 | 0.0045          |
| 0.0052        | 100.0 | 2100 | 0.0045          |

Framework versions

  • PEFT 0.8.2
  • Transformers 4.37.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.15.2
