sft-fsi

This model is a fine-tuned version of dynamofl/dynamo-1.6B-v0.4-mosaic-dynamoDPO-iter0-2978 on the dynamofl/train-default-FSI-PersonalFinancialAdvice-input-formatted-chatml dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5351

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-07
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 100
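
The relationship between the per-device and total batch sizes, and the shape of the cosine learning-rate schedule, can be sketched in a few lines of plain Python. This is an illustrative sketch, not the training code: it assumes no gradient accumulation and no warmup (neither is listed above), takes the 15 steps per epoch from the training results table, and uses the standard half-cosine decay formula. The helper names `total_batch_size` and `cosine_lr` are hypothetical.

```python
import math

# Values copied from the hyperparameter list.
PER_DEVICE_TRAIN_BATCH = 4
NUM_DEVICES = 8
BASE_LR = 2e-07
NUM_EPOCHS = 100
STEPS_PER_EPOCH = 15  # from the results table: step 15 is reached at epoch 1.0

# Assumptions (not stated in the card): no gradient accumulation, no warmup.
GRAD_ACCUM_STEPS = 1
WARMUP_STEPS = 0


def total_batch_size(per_device, devices, grad_accum=1):
    """Number of examples consumed per optimizer step across all devices."""
    return per_device * devices * grad_accum


def cosine_lr(step, total_steps, base_lr, warmup_steps=0):
    """Half-cosine decay from base_lr to 0, with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print(total_batch_size(PER_DEVICE_TRAIN_BATCH, NUM_DEVICES, GRAD_ACCUM_STEPS))  # 32
print(total_steps)  # 1500
for step in (0, 750, 1500):
    print(step, cosine_lr(step, total_steps, BASE_LR, WARMUP_STEPS))
```

Under these assumptions the learning rate starts at 2e-07, is roughly 1e-07 at the halfway point (step 750), and reaches zero at step 1500. With 32 examples per step and 15 steps per epoch, the training set would hold at most 480 examples.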

Training results

Training Loss | Epoch | Step | Validation Loss
17.4572 | 1.0 | 15 | 17.4173
16.0 | 2.0 | 30 | 15.7598
13.4471 | 3.0 | 45 | 12.5953
12.2511 | 4.0 | 60 | 11.4095
11.6039 | 5.0 | 75 | 10.9817
10.2763 | 6.0 | 90 | 9.5031
9.0617 | 7.0 | 105 | 8.6753
8.4037 | 8.0 | 120 | 7.9506
7.9582 | 9.0 | 135 | 7.1144
6.9666 | 10.0 | 150 | 6.4290
6.4408 | 11.0 | 165 | 5.7028
5.3708 | 12.0 | 180 | 4.4972
4.8165 | 13.0 | 195 | 3.8790
4.0134 | 14.0 | 210 | 2.8529
3.3501 | 15.0 | 225 | 2.3522
3.1818 | 16.0 | 240 | 1.9746
2.338 | 17.0 | 255 | 1.7629
2.0088 | 18.0 | 270 | 1.6484
2.293 | 19.0 | 285 | 1.4006
1.82 | 20.0 | 300 | 1.3265
1.8957 | 21.0 | 315 | 1.2599
1.5477 | 22.0 | 330 | 1.1908
1.3785 | 23.0 | 345 | 1.1995
1.5653 | 24.0 | 360 | 1.1229
1.5203 | 25.0 | 375 | 1.1563
1.7603 | 26.0 | 390 | 1.1781
1.3828 | 27.0 | 405 | 0.9678
1.1726 | 28.0 | 420 | 1.0369
1.4392 | 29.0 | 435 | 1.0777
1.1965 | 30.0 | 450 | 0.9542
1.1961 | 31.0 | 465 | 0.9490
1.1002 | 32.0 | 480 | 0.8936
1.3295 | 33.0 | 495 | 1.0326
1.0566 | 34.0 | 510 | 1.0263
1.1966 | 35.0 | 525 | 0.9777
1.1547 | 36.0 | 540 | 0.8877
1.2921 | 37.0 | 555 | 0.8489
1.0368 | 38.0 | 570 | 0.8568
1.0894 | 39.0 | 585 | 0.9249
1.182 | 40.0 | 600 | 0.9296
1.1232 | 41.0 | 615 | 0.9656
1.034 | 42.0 | 630 | 0.8042
1.1033 | 43.0 | 645 | 0.8467
1.0659 | 44.0 | 660 | 0.8005
0.9365 | 45.0 | 675 | 0.8196
0.9452 | 46.0 | 690 | 0.7149
0.9357 | 47.0 | 705 | 0.7847
0.9167 | 48.0 | 720 | 0.6707
0.884 | 49.0 | 735 | 0.6987
0.9829 | 50.0 | 750 | 0.7260
0.7688 | 51.0 | 765 | 0.7078
0.9165 | 52.0 | 780 | 0.6694
1.074 | 53.0 | 795 | 0.7018
0.9647 | 54.0 | 810 | 0.6790
0.9155 | 55.0 | 825 | 0.6542
0.8819 | 56.0 | 840 | 0.6652
0.7332 | 57.0 | 855 | 0.6124
0.8385 | 58.0 | 870 | 0.6184
0.7709 | 59.0 | 885 | 0.6434
0.9069 | 60.0 | 900 | 0.6387
0.8426 | 61.0 | 915 | 0.5717
0.8469 | 62.0 | 930 | 0.6204
0.7304 | 63.0 | 945 | 0.6720
0.7256 | 64.0 | 960 | 0.5895
0.6442 | 65.0 | 975 | 0.6164
0.744 | 66.0 | 990 | 0.5816
0.7043 | 67.0 | 1005 | 0.6566
0.8757 | 68.0 | 1020 | 0.6042
0.7355 | 69.0 | 1035 | 0.5842
0.7304 | 70.0 | 1050 | 0.5986
0.8012 | 71.0 | 1065 | 0.6174
0.7211 | 72.0 | 1080 | 0.5787
0.7411 | 73.0 | 1095 | 0.5619
0.8447 | 74.0 | 1110 | 0.5611
0.7919 | 75.0 | 1125 | 0.6355
0.6498 | 76.0 | 1140 | 0.5658
0.682 | 77.0 | 1155 | 0.5776
0.7562 | 78.0 | 1170 | 0.6282
0.7869 | 79.0 | 1185 | 0.5271
0.7478 | 80.0 | 1200 | 0.5542
0.7653 | 81.0 | 1215 | 0.5682
0.7067 | 82.0 | 1230 | 0.6346
0.691 | 83.0 | 1245 | 0.5932
0.7489 | 84.0 | 1260 | 0.5724
0.694 | 85.0 | 1275 | 0.5307
0.7985 | 86.0 | 1290 | 0.6010
0.7029 | 87.0 | 1305 | 0.5514
0.7678 | 88.0 | 1320 | 0.5660
0.7885 | 89.0 | 1335 | 0.5434
0.6703 | 90.0 | 1350 | 0.5838
0.7028 | 91.0 | 1365 | 0.5275
0.7731 | 92.0 | 1380 | 0.5433
0.6815 | 93.0 | 1395 | 0.5619
0.5923 | 94.0 | 1410 | 0.5609
0.7039 | 95.0 | 1425 | 0.5246
0.7842 | 96.0 | 1440 | 0.5473
0.7001 | 97.0 | 1455 | 0.5467
0.7169 | 98.0 | 1470 | 0.5881
0.6552 | 99.0 | 1485 | 0.5636
0.6765 | 100.0 | 1500 | 0.5571
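
Because the validation loss fluctuates from epoch to epoch, it can help to locate the checkpoint with the lowest value rather than assume the last one is best. The sketch below parses rows in the four-column layout of the table above (training loss, epoch, step, validation loss) and picks the minimum. Only the last six rows are embedded to keep the example short, and `parse_rows` is a hypothetical helper, not part of any library.

```python
# Excerpt: the last six rows of the training results table.
# Columns: training loss, epoch, step, validation loss.
ROWS = """\
0.7039 95.0 1425 0.5246
0.7842 96.0 1440 0.5473
0.7001 97.0 1455 0.5467
0.7169 98.0 1470 0.5881
0.6552 99.0 1485 0.5636
0.6765 100.0 1500 0.5571
"""


def parse_rows(text):
    """Turn table rows into a list of dicts; lines without 4 fields are skipped."""
    records = []
    for line in text.splitlines():
        # Accept pipe-delimited rows as well as whitespace-delimited ones.
        fields = line.replace("|", " ").split()
        if len(fields) != 4:
            continue  # skips blank lines and the multi-word header line
        train_loss, epoch, step, val_loss = fields
        records.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            }
        )
    return records


records = parse_rows(ROWS)
best = min(records, key=lambda r: r["val_loss"])
print(best["epoch"], best["step"], best["val_loss"])  # 95.0 1425 0.5246
```

Within this excerpt the lowest validation loss is 0.5246 at epoch 95 (step 1425). Scanning the full table by eye gives the same row, followed closely by 0.5271 at epoch 79 and 0.5275 at epoch 91.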

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1

Safetensors

  • Model size: 1.63B params
  • Tensor type: BF16