metadata
library_name: transformers
license: apache-2.0
base_model: reflex-ai/AMD-Llama-350M-Upgraded
tags:
  - generated_from_trainer
model-index:
  - name: amdchess
    results: []

amdchess

This model is a fine-tuned version of reflex-ai/AMD-Llama-350M-Upgraded. The training dataset was not recorded in this card. At the final evaluation (step 336), it achieves the following result on the evaluation set:

  • Loss: 1.6347
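
The loss can be converted to perplexity if it is the mean per-token cross-entropy in nats. That is the usual convention for causal language models, but it is an assumption here because the card does not say so. A minimal sketch:

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.6347

# Assumption: the loss is mean cross-entropy per token, measured in nats.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at a little over 5. Because the tokenizer and evaluation data are not documented, it should not be compared directly with figures from other models.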

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 0.1
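
To illustrate how the cosine scheduler shapes the learning rate over such a short run, here is a minimal sketch of half-cosine decay from the listed base rate. It assumes no warmup (none is listed) and a total of 339 optimizer steps, an estimate inferred from the results table (step 336 at epoch 0.0992, with num_epochs 0.1) rather than a value stated in the card. The names `cosine_lr`, `TOTAL_STEPS`, and `WARMUP_STEPS` are hypothetical.

```python
import math

BASE_LR = 3e-05      # learning_rate from the list above
TOTAL_STEPS = 339    # assumption: estimated, not stated in the card
WARMUP_STEPS = 0     # assumption: no warmup is listed


def cosine_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Print the rate at a few steps that appear in the results table.
for step in (0, 84, 168, 252, 336):
    print(step, f"{cosine_lr(step):.2e}")
```

With these assumptions the rate falls to a tiny fraction of its base value by step 336, which would be consistent with the validation loss changing only in the fourth decimal place over the last few logged rows.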

Training results

Training Loss   Epoch    Step   Validation Loss
8.019           0.0012   4      7.6135
7.7094          0.0024   8      7.0826
6.8737          0.0035   12     6.8392
6.6426          0.0047   16     6.6142
6.3563          0.0059   20     6.2879
6.0826          0.0071   24     5.9688
5.8464          0.0083   28     5.5885
5.3209          0.0094   32     5.4342
5.2345          0.0106   36     5.2125
4.9003          0.0118   40     4.9282
4.6779          0.0130   44     4.7029
4.3778          0.0142   48     4.3920
4.3256          0.0154   52     4.1814
3.9975          0.0165   56     4.0072
3.73            0.0177   60     3.8358
4.0483          0.0189   64     3.7093
3.7907          0.0201   68     3.5874
3.3881          0.0213   72     3.4606
3.5066          0.0224   76     3.4071
3.3845          0.0236   80     3.2889
3.2318          0.0248   84     3.1932
3.5897          0.0260   88     3.1209
3.0362          0.0272   92     3.0123
2.7973          0.0283   96     2.9055
2.8976          0.0295   100    2.8210
2.8188          0.0307   104    2.7422
2.5149          0.0319   108    2.6395
2.495           0.0331   112    2.5714
2.5654          0.0342   116    2.4863
2.4205          0.0354   120    2.4448
2.3487          0.0366   124    2.3561
2.413           0.0378   128    2.3265
2.2713          0.0390   132    2.2814
2.2293          0.0402   136    2.2361
2.2793          0.0413   140    2.1745
2.185           0.0425   144    2.1444
2.0137          0.0437   148    2.1245
2.1408          0.0449   152    2.0849
2.1539          0.0461   156    2.0650
2.0592          0.0472   160    2.0345
1.9849          0.0484   164    2.0390
1.8796          0.0496   168    1.9978
1.9646          0.0508   172    1.9860
1.9913          0.0520   176    1.9388
1.967           0.0531   180    1.9121
1.9141          0.0543   184    1.9085
1.9513          0.0555   188    1.9040
1.9123          0.0567   192    1.8606
1.8204          0.0579   196    1.8556
1.9311          0.0590   200    1.8390
1.8425          0.0602   204    1.8162
1.7932          0.0614   208    1.7914
1.591           0.0626   212    1.7749
1.7899          0.0638   216    1.7667
1.7094          0.0650   220    1.7637
1.8023          0.0661   224    1.7458
1.7368          0.0673   228    1.7339
1.5679          0.0685   232    1.7281
1.7265          0.0697   236    1.7221
1.7034          0.0709   240    1.7093
1.5902          0.0720   244    1.7086
1.6903          0.0732   248    1.6976
1.7581          0.0744   252    1.6944
1.656           0.0756   256    1.6899
1.4287          0.0768   260    1.6858
1.6527          0.0779   264    1.6754
1.7206          0.0791   268    1.6787
1.8268          0.0803   272    1.6673
1.538           0.0815   276    1.6590
1.7374          0.0827   280    1.6711
1.7255          0.0839   284    1.6513
1.6032          0.0850   288    1.6552
1.5297          0.0862   292    1.6458
1.7639          0.0874   296    1.6488
1.8029          0.0886   300    1.6441
1.665           0.0898   304    1.6425
1.6854          0.0909   308    1.6425
1.5418          0.0921   312    1.6396
1.6943          0.0933   316    1.6373
1.6758          0.0945   320    1.6359
1.9994          0.0957   324    1.6352
1.6326          0.0968   328    1.6349
1.6935          0.0980   332    1.6348
1.6358          0.0992   336    1.6347
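
The Epoch and Step columns allow a rough estimate of how large the undocumented training set is. The sketch below assumes a single device and no gradient accumulation (neither is stated in the card), so that each optimizer step consumes exactly train_batch_size examples. The logged epoch values are rounded to four decimals, so the result is approximate.

```python
# Last logged row of the results table: step 336 is reached at epoch 0.0992.
last_step = 336
last_epoch = 0.0992
train_batch_size = 4  # from the hyperparameter list

# Steps needed to complete one full pass over the training data.
steps_per_epoch = last_step / last_epoch

# Assumption: one device and no gradient accumulation,
# so each step consumes train_batch_size examples.
approx_train_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training examples ~ {approx_train_examples:.0f}")
```

Under these assumptions the training set holds on the order of 13,500 examples, of which this run saw roughly a tenth.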

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.5.0+cu121
  • Datasets 3.0.2
  • Tokenizers 0.19.1