
MIDICausalFinetuning2

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6756
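
A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under a repo id like `your-username/MIDICausalFinetuning2` (a hypothetical placeholder) and that the repo ships a Transformers-compatible tokenizer; the model name suggests a standard causal-LM interface:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; replace with the actual checkpoint location.
repo_id = "your-username/MIDICausalFinetuning2"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)  # 32.1M params, F32 safetensors

# Autoregressively continue a prompt of (presumably MIDI-derived) tokens.
inputs = tokenizer("...", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```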

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
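
A sketch of how these settings map onto `transformers.TrainingArguments`; the output directory and the evaluation strategy are assumptions (the per-epoch validation losses below suggest `eval_strategy="epoch"`), not values taken from this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="MIDICausalFinetuning2",  # placeholder output path, not from this card
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                      # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    eval_strategy="epoch",               # assumed from the per-epoch results below
)
```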

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 9 | 7.7655 |
| No log | 2.0 | 18 | 6.4257 |
| No log | 3.0 | 27 | 5.4697 |
| No log | 4.0 | 36 | 4.9705 |
| No log | 5.0 | 45 | 4.7258 |
| No log | 6.0 | 54 | 4.5740 |
| No log | 7.0 | 63 | 4.4554 |
| No log | 8.0 | 72 | 4.3483 |
| No log | 9.0 | 81 | 4.2406 |
| No log | 10.0 | 90 | 4.1217 |
| No log | 11.0 | 99 | 3.9690 |
| No log | 12.0 | 108 | 3.7765 |
| No log | 13.0 | 117 | 3.6364 |
| No log | 14.0 | 126 | 3.5090 |
| No log | 15.0 | 135 | 3.4009 |
| No log | 16.0 | 144 | 3.2948 |
| No log | 17.0 | 153 | 3.1934 |
| No log | 18.0 | 162 | 3.1031 |
| No log | 19.0 | 171 | 3.0232 |
| No log | 20.0 | 180 | 2.9464 |
| No log | 21.0 | 189 | 2.8734 |
| No log | 22.0 | 198 | 2.8016 |
| No log | 23.0 | 207 | 2.7296 |
| No log | 24.0 | 216 | 2.6571 |
| No log | 25.0 | 225 | 2.5846 |
| No log | 26.0 | 234 | 2.5193 |
| No log | 27.0 | 243 | 2.4498 |
| No log | 28.0 | 252 | 2.3844 |
| No log | 29.0 | 261 | 2.3150 |
| No log | 30.0 | 270 | 2.2558 |
| No log | 31.0 | 279 | 2.1873 |
| No log | 32.0 | 288 | 2.1213 |
| No log | 33.0 | 297 | 2.0649 |
| No log | 34.0 | 306 | 1.9997 |
| No log | 35.0 | 315 | 1.9421 |
| No log | 36.0 | 324 | 1.8803 |
| No log | 37.0 | 333 | 1.8131 |
| No log | 38.0 | 342 | 1.7380 |
| No log | 39.0 | 351 | 1.6847 |
| No log | 40.0 | 360 | 1.5993 |
| No log | 41.0 | 369 | 1.5855 |
| No log | 42.0 | 378 | 1.5034 |
| No log | 43.0 | 387 | 1.4867 |
| No log | 44.0 | 396 | 1.4380 |
| No log | 45.0 | 405 | 1.4309 |
| No log | 46.0 | 414 | 1.3585 |
| No log | 47.0 | 423 | 1.3231 |
| No log | 48.0 | 432 | 1.3071 |
| No log | 49.0 | 441 | 1.2690 |
| No log | 50.0 | 450 | 1.2417 |
| No log | 51.0 | 459 | 1.2078 |
| No log | 52.0 | 468 | 1.1709 |
| No log | 53.0 | 477 | 1.1457 |
| No log | 54.0 | 486 | 1.1317 |
| No log | 55.0 | 495 | 1.1155 |
| 2.8999 | 56.0 | 504 | 1.0914 |
| 2.8999 | 57.0 | 513 | 1.0625 |
| 2.8999 | 58.0 | 522 | 1.0380 |
| 2.8999 | 59.0 | 531 | 1.0190 |
| 2.8999 | 60.0 | 540 | 0.9976 |
| 2.8999 | 61.0 | 549 | 0.9716 |
| 2.8999 | 62.0 | 558 | 0.9544 |
| 2.8999 | 63.0 | 567 | 0.9289 |
| 2.8999 | 64.0 | 576 | 0.9157 |
| 2.8999 | 65.0 | 585 | 0.8983 |
| 2.8999 | 66.0 | 594 | 0.8923 |
| 2.8999 | 67.0 | 603 | 0.8751 |
| 2.8999 | 68.0 | 612 | 0.8684 |
| 2.8999 | 69.0 | 621 | 0.8485 |
| 2.8999 | 70.0 | 630 | 0.8349 |
| 2.8999 | 71.0 | 639 | 0.8261 |
| 2.8999 | 72.0 | 648 | 0.8072 |
| 2.8999 | 73.0 | 657 | 0.8034 |
| 2.8999 | 74.0 | 666 | 0.7947 |
| 2.8999 | 75.0 | 675 | 0.7787 |
| 2.8999 | 76.0 | 684 | 0.7700 |
| 2.8999 | 77.0 | 693 | 0.7581 |
| 2.8999 | 78.0 | 702 | 0.7577 |
| 2.8999 | 79.0 | 711 | 0.7472 |
| 2.8999 | 80.0 | 720 | 0.7514 |
| 2.8999 | 81.0 | 729 | 0.7317 |
| 2.8999 | 82.0 | 738 | 0.7334 |
| 2.8999 | 83.0 | 747 | 0.7233 |
| 2.8999 | 84.0 | 756 | 0.7148 |
| 2.8999 | 85.0 | 765 | 0.7139 |
| 2.8999 | 86.0 | 774 | 0.7048 |
| 2.8999 | 87.0 | 783 | 0.7033 |
| 2.8999 | 88.0 | 792 | 0.6972 |
| 2.8999 | 89.0 | 801 | 0.6946 |
| 2.8999 | 90.0 | 810 | 0.6899 |
| 2.8999 | 91.0 | 819 | 0.6867 |
| 2.8999 | 92.0 | 828 | 0.6852 |
| 2.8999 | 93.0 | 837 | 0.6855 |
| 2.8999 | 94.0 | 846 | 0.6815 |
| 2.8999 | 95.0 | 855 | 0.6793 |
| 2.8999 | 96.0 | 864 | 0.6782 |
| 2.8999 | 97.0 | 873 | 0.6754 |
| 2.8999 | 98.0 | 882 | 0.6763 |
| 2.8999 | 99.0 | 891 | 0.6758 |
| 2.8999 | 100.0 | 900 | 0.6756 |

("No log" means no training loss had been recorded yet: with the Trainer's default logging interval of 500 steps, the first training-loss value, 2.8999, appears only once training passes step 500.)

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
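
A quick sanity-check sketch for reproducing this environment, using the version strings listed above (the `+cu121` suffix indicates a CUDA 12.1 build of PyTorch):

```python
import datasets
import tokenizers
import torch
import transformers

# Verify the installed versions match those used to train this model.
assert transformers.__version__ == "4.41.2"
assert torch.__version__.startswith("2.3.0")  # reported as 2.3.0+cu121
assert datasets.__version__ == "2.19.1"
assert tokenizers.__version__ == "0.19.1"
print("Environment matches the model card.")
```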
Model size

  • 32.1M params (F32, Safetensors)