# collapse_gemma-2-9b_hs2_accumulate_iter2_sftsd1
This model is a fine-tuned version of [google/gemma-2-9b](https://huggingface.co/google/gemma-2-9b) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.9401
- Num Input Tokens Seen: 9695924
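
As a quick start, here is a minimal loading sketch. It assumes the checkpoint is hosted on the Hugging Face Hub under the repository id in this card's title; the dtype and device placement are illustrative choices, not part of the original card:

```python
# Minimal sketch: load the fine-tuned checkpoint from the Hugging Face Hub.
# The repo id is taken from this card's title; dtype/device_map are assumptions
# (device_map="auto" additionally requires the `accelerate` package).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "RylanSchaeffer/collapse_gemma-2-9b_hs2_accumulate_iter2_sftsd1"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # gemma-2-9b is large; bf16 halves memory
    device_map="auto",
)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```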
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 4
- eval_batch_size: 16
- seed: 1
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
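
For reference, below is a hedged sketch of how these hyperparameters map onto Hugging Face `TrainingArguments`. It assumes a single device (so 4 × 32 gradient-accumulation steps yield the reported effective batch size of 128); the `output_dir` name and the evaluation/logging cadence (every 5 steps, inferred from the results table below) are assumptions, not the author's actual script:

```python
# Sketch only: reconstructs the reported hyperparameters as TrainingArguments.
# Dataset, model loading, and Trainer wiring are omitted because the training
# data is not documented in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="collapse_gemma-2-9b_hs2_accumulate_iter2_sftsd1",  # hypothetical
    learning_rate=8e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=16,
    seed=1,
    gradient_accumulation_steps=32,   # 4 * 32 = 128 effective batch on one device
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",            # inferred: the card logs validation loss every 5 steps
    eval_steps=5,
    logging_steps=5,
)
```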
### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| No log | 0 | 0 | 1.2335 | 0 |
| 1.2409 | 0.0268 | 5 | 1.1095 | 255572 |
| 1.0595 | 0.0536 | 10 | 1.0210 | 515832 |
| 0.9484 | 0.0804 | 15 | 0.9901 | 775964 |
| 0.8491 | 0.1072 | 20 | 0.9915 | 1028892 |
| 0.7828 | 0.1340 | 25 | 0.9898 | 1286060 |
| 0.7926 | 0.1608 | 30 | 0.9886 | 1549820 |
| 0.6968 | 0.1875 | 35 | 0.9855 | 1809168 |
| 0.751 | 0.2143 | 40 | 0.9830 | 2071332 |
| 0.6349 | 0.2411 | 45 | 0.9763 | 2329060 |
| 0.5858 | 0.2679 | 50 | 0.9717 | 2582412 |
| 0.6271 | 0.2947 | 55 | 0.9682 | 2841400 |
| 0.539 | 0.3215 | 60 | 0.9675 | 3103564 |
| 0.6166 | 0.3483 | 65 | 0.9633 | 3371340 |
| 0.6678 | 0.3751 | 70 | 0.9611 | 3634204 |
| 0.5751 | 0.4019 | 75 | 0.9581 | 3892340 |
| 0.5311 | 0.4287 | 80 | 0.9560 | 4156988 |
| 0.6751 | 0.4555 | 85 | 0.9548 | 4419404 |
| 0.6184 | 0.4823 | 90 | 0.9538 | 4677684 |
| 0.6578 | 0.5090 | 95 | 0.9523 | 4937352 |
| 0.6409 | 0.5358 | 100 | 0.9522 | 5199988 |
| 0.6468 | 0.5626 | 105 | 0.9507 | 5461972 |
| 0.5908 | 0.5894 | 110 | 0.9494 | 5724396 |
| 0.5753 | 0.6162 | 115 | 0.9490 | 5986712 |
| 0.5835 | 0.6430 | 120 | 0.9489 | 6238272 |
| 0.4922 | 0.6698 | 125 | 0.9483 | 6502692 |
| 0.5653 | 0.6966 | 130 | 0.9465 | 6766008 |
| 0.4244 | 0.7234 | 135 | 0.9458 | 7026916 |
| 0.561 | 0.7502 | 140 | 0.9455 | 7285852 |
| 0.5852 | 0.7770 | 145 | 0.9460 | 7548120 |
| 0.5483 | 0.8038 | 150 | 0.9445 | 7813604 |
| 0.5537 | 0.8305 | 155 | 0.9442 | 8074268 |
| 0.567 | 0.8573 | 160 | 0.9438 | 8329848 |
| 0.486 | 0.8841 | 165 | 0.9435 | 8586556 |
| 0.5464 | 0.9109 | 170 | 0.9422 | 8853500 |
| 0.5167 | 0.9377 | 175 | 0.9406 | 9116632 |
| 0.5577 | 0.9645 | 180 | 0.9423 | 9374420 |
| 0.5194 | 0.9913 | 185 | 0.9407 | 9644032 |
### Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
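
To sanity-check a reproduction environment against these pins, a simple version check such as the following can be used (this snippet is an illustration, not part of the original card):

```python
# Compare installed library versions against the ones this model was trained with.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.44.0",
    "torch": "2.4.0",          # the trained build was 2.4.0+cu121
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__.split("+")[0],  # drop the local CUDA tag
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else f"MISMATCH (have {have})"
    print(f"{name}=={want}: {status}")
```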