metadata

library_name: transformers
license: apache-2.0
base_model: reflex-ai/AMD-Llama-350M-Upgraded
tags:
  - generated_from_trainer
model-index:
  - name: amdchess
    results: []

amdchess

This model is a fine-tuned version of reflex-ai/AMD-Llama-350M-Upgraded. The training dataset is not recorded in this card. At the final logged step (step 336), it achieves the following result on the evaluation set:

Loss: 1.6347

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
num_epochs: 0.1
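
As a rough aid for reproducing a comparable run, the values above could be mapped onto Hugging Face `TrainingArguments` as sketched below. Several things here are assumptions rather than facts from this card: the `output_dir` name is hypothetical, the batch sizes are taken to be per-device, and the 4-step evaluation and logging interval is inferred from the step column of the training results rather than stated anywhere.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 3e-05,
    "per_device_train_batch_size": 4,  # assumption: reported batch size is per device
    "per_device_eval_batch_size": 4,   # assumption: reported batch size is per device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 0.1,
}

# Assumptions not stated in the card: the results are logged every 4 steps,
# so evaluation and logging are set to that interval here.
ASSUMED_EXTRAS = {
    "eval_strategy": "steps",
    "eval_steps": 4,
    "logging_steps": 4,
}


def build_training_args(output_dir="amdchess-repro"):
    """Build TrainingArguments from the values above.

    Requires transformers (with torch and accelerate) to be installed;
    the import is deferred so the dictionaries can be inspected without them.
    The default output_dir is a hypothetical name.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS, **ASSUMED_EXTRAS)
```

The card does not record warmup steps, gradient accumulation, or weight decay, so this sketch leaves them at the library defaults, which may differ from the original run.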

Training results

Training Loss	Epoch	Step	Validation Loss
8.019	0.0012	4	7.6135
7.7094	0.0024	8	7.0826
6.8737	0.0035	12	6.8392
6.6426	0.0047	16	6.6142
6.3563	0.0059	20	6.2879
6.0826	0.0071	24	5.9688
5.8464	0.0083	28	5.5885
5.3209	0.0094	32	5.4342
5.2345	0.0106	36	5.2125
4.9003	0.0118	40	4.9282
4.6779	0.0130	44	4.7029
4.3778	0.0142	48	4.3920
4.3256	0.0154	52	4.1814
3.9975	0.0165	56	4.0072
3.73	0.0177	60	3.8358
4.0483	0.0189	64	3.7093
3.7907	0.0201	68	3.5874
3.3881	0.0213	72	3.4606
3.5066	0.0224	76	3.4071
3.3845	0.0236	80	3.2889
3.2318	0.0248	84	3.1932
3.5897	0.0260	88	3.1209
3.0362	0.0272	92	3.0123
2.7973	0.0283	96	2.9055
2.8976	0.0295	100	2.8210
2.8188	0.0307	104	2.7422
2.5149	0.0319	108	2.6395
2.495	0.0331	112	2.5714
2.5654	0.0342	116	2.4863
2.4205	0.0354	120	2.4448
2.3487	0.0366	124	2.3561
2.413	0.0378	128	2.3265
2.2713	0.0390	132	2.2814
2.2293	0.0402	136	2.2361
2.2793	0.0413	140	2.1745
2.185	0.0425	144	2.1444
2.0137	0.0437	148	2.1245
2.1408	0.0449	152	2.0849
2.1539	0.0461	156	2.0650
2.0592	0.0472	160	2.0345
1.9849	0.0484	164	2.0390
1.8796	0.0496	168	1.9978
1.9646	0.0508	172	1.9860
1.9913	0.0520	176	1.9388
1.967	0.0531	180	1.9121
1.9141	0.0543	184	1.9085
1.9513	0.0555	188	1.9040
1.9123	0.0567	192	1.8606
1.8204	0.0579	196	1.8556
1.9311	0.0590	200	1.8390
1.8425	0.0602	204	1.8162
1.7932	0.0614	208	1.7914
1.591	0.0626	212	1.7749
1.7899	0.0638	216	1.7667
1.7094	0.0650	220	1.7637
1.8023	0.0661	224	1.7458
1.7368	0.0673	228	1.7339
1.5679	0.0685	232	1.7281
1.7265	0.0697	236	1.7221
1.7034	0.0709	240	1.7093
1.5902	0.0720	244	1.7086
1.6903	0.0732	248	1.6976
1.7581	0.0744	252	1.6944
1.656	0.0756	256	1.6899
1.4287	0.0768	260	1.6858
1.6527	0.0779	264	1.6754
1.7206	0.0791	268	1.6787
1.8268	0.0803	272	1.6673
1.538	0.0815	276	1.6590
1.7374	0.0827	280	1.6711
1.7255	0.0839	284	1.6513
1.6032	0.0850	288	1.6552
1.5297	0.0862	292	1.6458
1.7639	0.0874	296	1.6488
1.8029	0.0886	300	1.6441
1.665	0.0898	304	1.6425
1.6854	0.0909	308	1.6425
1.5418	0.0921	312	1.6396
1.6943	0.0933	316	1.6373
1.6758	0.0945	320	1.6359
1.9994	0.0957	324	1.6352
1.6326	0.0968	328	1.6349
1.6935	0.0980	332	1.6348
1.6358	0.0992	336	1.6347
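
The final row of the table can be turned into two rough derived figures: a validation perplexity and an estimate of the training set size. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, and that training ran on a single device without gradient accumulation; neither is stated in this card. The epoch column is rounded to four decimals, so the size estimate is approximate.

```python
import math

# Values read from the last row of the table above.
final_step = 336
final_epoch = 0.0992      # rounded to 4 decimals in the table
final_val_loss = 1.6347

# Assumption: the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(final_val_loss)

# Rough dataset size. Assumes one device and no gradient accumulation,
# so one optimizer step consumes train_batch_size examples.
train_batch_size = 4
steps_per_epoch = final_step / final_epoch
approx_train_examples = steps_per_epoch * train_batch_size

print(f"validation perplexity ~ {perplexity:.2f}")
print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training examples ~ {approx_train_examples:.0f}")
```

Under these assumptions the perplexity comes out a little above 5, and the run covers roughly a tenth of a training set on the order of 13,500 examples.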

Framework versions

Transformers 4.44.2
PyTorch 2.5.0+cu121
Datasets 3.0.2
Tokenizers 0.19.1
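
To check a local environment against the versions listed above, something like the following sketch may help. It uses only the standard library; the package names are assumed to be the usual PyPI distribution names (`torch` for PyTorch), and a mismatch does not necessarily mean the model will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; keys are assumed PyPI distribution names.
CARD_VERSIONS = {
    "transformers": "4.44.2",
    "torch": "2.5.0+cu121",
    "datasets": "3.0.2",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {package: (expected, installed_or_None)} for each listed package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions().items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```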