
myBit-Llama2-jp-127M-3

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 13.0221

Model description

More information needed. Per the repository files, the safetensors checkpoint contains roughly 128M parameters stored as F32 tensors.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the reproduction sketch after this list):

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: polynomial
  • num_epochs: 100
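
These settings map onto the Hugging Face `TrainingArguments` API roughly as follows. This is a minimal reproduction sketch, not the author's actual script: `output_dir` is a placeholder, and the Adam arguments simply spell out the values listed above.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters as TrainingArguments.
# "out" is a placeholder output directory, not from the original run.
args = TrainingArguments(
    output_dir="out",
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="polynomial",
    num_train_epochs=100,
)
```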

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 7.8184 | 1.25 | 10 | 8.3355 |
| 5.4327 | 2.5 | 20 | 7.6000 |
| 5.0861 | 3.75 | 30 | 7.8126 |
| 4.7586 | 5.0 | 40 | 7.5748 |
| 4.4392 | 6.25 | 50 | 7.4509 |
| 4.1938 | 7.5 | 60 | 7.3834 |
| 4.0095 | 8.75 | 70 | 7.2750 |
| 3.905 | 10.0 | 80 | 7.3800 |
| 3.6536 | 11.25 | 90 | 7.4560 |
| 3.3187 | 12.5 | 100 | 7.6310 |
| 3.3315 | 13.75 | 110 | 8.0397 |
| 2.9308 | 15.0 | 120 | 8.3902 |
| 2.679 | 16.25 | 130 | 9.0364 |
| 2.2896 | 17.5 | 140 | 9.8766 |
| 1.8407 | 18.75 | 150 | 10.7682 |
| 1.5081 | 20.0 | 160 | 11.7175 |
| 0.9778 | 21.25 | 170 | 12.8239 |
| 0.6572 | 22.5 | 180 | 13.6506 |
| 0.5411 | 23.75 | 190 | 14.2579 |
| 0.44 | 25.0 | 200 | 14.5732 |
| 0.3283 | 26.25 | 210 | 15.1087 |
| 0.2507 | 27.5 | 220 | 15.0569 |
| 0.2044 | 28.75 | 230 | 15.1893 |
| 0.1838 | 30.0 | 240 | 15.6291 |
| 0.1626 | 31.25 | 250 | 15.4617 |
| 0.1124 | 32.5 | 260 | 15.2738 |
| 0.1011 | 33.75 | 270 | 15.2130 |
| 0.0845 | 35.0 | 280 | 15.2749 |
| 0.0852 | 36.25 | 290 | 15.3292 |
| 0.1025 | 37.5 | 300 | 15.1574 |
| 0.1075 | 38.75 | 310 | 15.1100 |
| 0.079 | 40.0 | 320 | 14.8177 |
| 0.0857 | 41.25 | 330 | 14.8609 |
| 0.0629 | 42.5 | 340 | 14.6443 |
| 0.0713 | 43.75 | 350 | 14.5514 |
| 0.0594 | 45.0 | 360 | 14.6032 |
| 0.0557 | 46.25 | 370 | 14.3489 |
| 0.0554 | 47.5 | 380 | 14.3289 |
| 0.0548 | 48.75 | 390 | 14.1991 |
| 0.0528 | 50.0 | 400 | 14.1350 |
| 0.0515 | 51.25 | 410 | 13.9952 |
| 0.0529 | 52.5 | 420 | 13.9788 |
| 0.0516 | 53.75 | 430 | 13.9438 |
| 0.0506 | 55.0 | 440 | 13.8746 |
| 0.049 | 56.25 | 450 | 13.7564 |
| 0.0491 | 57.5 | 460 | 13.7900 |
| 0.0493 | 58.75 | 470 | 13.6992 |
| 0.0491 | 60.0 | 480 | 13.6421 |
| 0.0497 | 61.25 | 490 | 13.6419 |
| 0.0489 | 62.5 | 500 | 13.5448 |
| 0.0504 | 63.75 | 510 | 13.5048 |
| 0.0508 | 65.0 | 520 | 13.5077 |
| 0.0488 | 66.25 | 530 | 13.5045 |
| 0.0485 | 67.5 | 540 | 13.4404 |
| 0.0493 | 68.75 | 550 | 13.4167 |
| 0.0507 | 70.0 | 560 | 13.3758 |
| 0.0491 | 71.25 | 570 | 13.3239 |
| 0.0484 | 72.5 | 580 | 13.3139 |
| 0.0472 | 73.75 | 590 | 13.2933 |
| 0.0493 | 75.0 | 600 | 13.3105 |
| 0.0475 | 76.25 | 610 | 13.2306 |
| 0.0465 | 77.5 | 620 | 13.2378 |
| 0.0474 | 78.75 | 630 | 13.2074 |
| 0.0468 | 80.0 | 640 | 13.1871 |
| 0.0466 | 81.25 | 650 | 13.2055 |
| 0.0459 | 82.5 | 660 | 13.1327 |
| 0.0466 | 83.75 | 670 | 13.1801 |
| 0.0485 | 85.0 | 680 | 13.1610 |
| 0.046 | 86.25 | 690 | 13.1439 |
| 0.0467 | 87.5 | 700 | 13.1114 |
| 0.0455 | 88.75 | 710 | 13.1123 |
| 0.0456 | 90.0 | 720 | 13.0635 |
| 0.0447 | 91.25 | 730 | 13.0997 |
| 0.0449 | 92.5 | 740 | 13.0704 |
| 0.0453 | 93.75 | 750 | 13.0531 |
| 0.0451 | 95.0 | 760 | 13.0432 |
| 0.0442 | 96.25 | 770 | 13.0311 |
| 0.0444 | 97.5 | 780 | 13.0329 |
| 0.0432 | 98.75 | 790 | 13.0491 |
| 0.0442 | 100.0 | 800 | 13.0221 |
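
The trajectory above shows validation loss bottoming out at 7.2750 (epoch 8.75, step 70) and then rising steadily while training loss falls toward zero, i.e. severe overfitting; the final 13.0221 is far above the best checkpoint. As a hedged sketch (assuming the standard Transformers `Trainer` was used, which this card does not confirm), early stopping plus best-checkpoint restoration would keep the low-loss model:

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

# Sketch only: model, train_ds, and eval_ds are assumed to exist elsewhere.
args = TrainingArguments(
    output_dir="out",                   # placeholder
    evaluation_strategy="steps",        # the table logs an eval every 10 steps
    eval_steps=10,
    save_strategy="steps",
    save_steps=10,
    load_best_model_at_end=True,        # restore the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```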

Framework versions

  • Transformers 4.39.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
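
Usage

Because the repository ships custom modeling code (which is also why the serverless Inference API cannot host it), loading locally requires `trust_remote_code=True`. A minimal usage sketch; the repo id below is a placeholder, not the verified namespace:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<namespace>/myBit-Llama2-jp-127M-3"  # placeholder namespace

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("こんにちは、", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```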