mambarim-110m-chat

This model is a fine-tuned version of dominguesm/mambarim-110m on an unknown dataset. The final validation loss logged during training is 2.5904, reached at step 55000 (epoch 2.9950).

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss
2.8055	0.0545	1000	2.7821
2.8298	0.1089	2000	2.7619
2.9104	0.1634	3000	2.7539
2.6692	0.2178	4000	2.7379
2.5876	0.2723	5000	2.7325
2.7439	0.3267	6000	2.7203
2.7787	0.3812	7000	2.7178
2.8461	0.4356	8000	2.7117
2.6929	0.4901	9000	2.7060
2.7229	0.5445	10000	2.7005
2.5014	0.5990	11000	2.6948
2.5046	0.6535	12000	2.6923
2.6258	0.7079	13000	2.6898
2.5822	0.7624	14000	2.6847
2.6399	0.8168	15000	2.6847
2.5342	0.8713	16000	2.6768
2.6878	0.9257	17000	2.6726
2.8872	0.9802	18000	2.6729
2.6565	1.0346	19000	2.6693
2.4293	1.0891	20000	2.6672
2.8411	1.1435	21000	2.6620
2.7126	1.1980	22000	2.6618
2.5516	1.2525	23000	2.6609
2.6093	1.3069	24000	2.6557
2.6489	1.3614	25000	2.6554
2.6014	1.4158	26000	2.6522
2.6185	1.4703	27000	2.6477
2.6896	1.5247	28000	2.6468
2.6222	1.5792	29000	2.6433
2.6227	1.6336	30000	2.6415
2.5772	1.6881	31000	2.6377
2.4859	1.7425	32000	2.6356
2.3725	1.7970	33000	2.6327
2.5452	1.8514	34000	2.6308
2.6545	1.9059	35000	2.6281
2.6109	1.9604	36000	2.6265
2.5004	2.0148	37000	2.6237
2.4471	2.0693	38000	2.6236
2.5242	2.1237	39000	2.6211
2.6242	2.1782	40000	2.6175
2.5610	2.2326	41000	2.6168
2.5065	2.2871	42000	2.6149
2.6165	2.3415	43000	2.6122
2.4452	2.3960	44000	2.6098
2.6277	2.4504	45000	2.6075
2.5547	2.5049	46000	2.6062
2.5153	2.5594	47000	2.6028
2.6322	2.6138	48000	2.6020
2.5263	2.6683	49000	2.5995
2.7165	2.7227	50000	2.5974
2.6576	2.7772	51000	2.5956
2.5471	2.8316	52000	2.5940
2.7174	2.8861	53000	2.5923
2.5018	2.9405	54000	2.5910
2.6201	2.9950	55000	2.5904
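
The validation losses in the table can be easier to interpret as perplexities. The sketch below is illustrative only: it assumes the logged loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card), and the helper names `perplexity` and `steps_per_epoch` are hypothetical, not part of any library. The rows are a small sample copied from the table.

```python
import math

# (training_loss, epoch, step, validation_loss) rows copied from the table.
rows = [
    (2.8055, 0.0545, 1000, 2.7821),
    (2.6878, 0.9257, 17000, 2.6726),
    (2.6109, 1.9604, 36000, 2.6265),
    (2.6201, 2.9950, 55000, 2.5904),
]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss to perplexity.

    Assumption: the loss is measured in nats per token.
    """
    return math.exp(loss)


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate optimizer steps per epoch from one logged (epoch, step) pair."""
    return step / epoch


for train_loss, epoch, step, val_loss in rows:
    print(f"step {step:>6}  epoch {epoch:.4f}  "
          f"val loss {val_loss:.4f}  val perplexity {perplexity(val_loss):.2f}")

# Use the last row, where rounding of the epoch value matters least.
_, last_epoch, last_step, _ = rows[-1]
print(f"approx. steps per epoch: {steps_per_epoch(last_epoch, last_step):.0f}")
```

Under that assumption, validation perplexity falls from about 16.2 at step 1000 to about 13.3 at step 55000, and the logged epoch values imply roughly 18,400 optimizer steps per epoch.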