# CodeLlama-7B-Instruct-AWQ-FaVe-20epochs
This model is a fine-tuned version of [TheBloke/CodeLlama-7B-Instruct-AWQ](https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-AWQ) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.4574
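
The framework versions below list PEFT, so this repository most likely contains a LoRA-style adapter rather than merged weights. A minimal inference sketch under that assumption (the adapter repo id is a placeholder, and loading the AWQ base through transformers requires the `autoawq` package):

```python
# Minimal inference sketch. Assumptions: this repo is a PEFT adapter on the AWQ base,
# and `autoawq` is installed so transformers can load the quantized checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TheBloke/CodeLlama-7B-Instruct-AWQ"
adapter_id = "your-username/CodeLlama-7B-Instruct-AWQ-FaVe-20epochs"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    device_map="auto",
    torch_dtype=torch.float16,  # AWQ kernels expect fp16 activations
)
model = PeftModel.from_pretrained(base, adapter_id)

# CodeLlama-Instruct uses the [INST] ... [/INST] chat format.
prompt = "[INST] Write a Python function that reverses a string. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```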
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10
- num_epochs: 20
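
For reproducibility, here is how these values might map onto `transformers.TrainingArguments`. This is a reconstruction from the list above, not the original training script; `output_dir` and anything not listed are assumptions:

```python
# Sketch of TrainingArguments matching the hyperparameters above (a reconstruction,
# not the original script; unlisted settings are left at their defaults).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="CodeLlama-7B-Instruct-AWQ-FaVe-20epochs",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 1 * 4 = 4
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=20,
    adam_beta1=0.9,    # Adam betas=(0.9, 0.999) and epsilon=1e-08
    adam_beta2=0.999,  # are the optimizer defaults, shown here for clarity
    adam_epsilon=1e-8,
)
```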
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 0.2685 | 10 | 2.1959 |
| 2.4678 | 0.5369 | 20 | 2.0730 |
| 2.4678 | 0.8054 | 30 | 1.8722 |
| 2.0575 | 1.0738 | 40 | 1.5546 |
| 2.0575 | 1.3423 | 50 | 1.3307 |
| 1.4122 | 1.6107 | 60 | 1.1023 |
| 1.4122 | 1.8792 | 70 | 0.9597 |
| 0.9644 | 2.1477 | 80 | 0.8650 |
| 0.9644 | 2.4161 | 90 | 0.7989 |
| 0.7959 | 2.6846 | 100 | 0.7489 |
| 0.7959 | 2.9530 | 110 | 0.7237 |
| 0.6573 | 3.2215 | 120 | 0.6950 |
| 0.6573 | 3.4899 | 130 | 0.6752 |
| 0.6282 | 3.7584 | 140 | 0.6501 |
| 0.6282 | 4.0268 | 150 | 0.6392 |
| 0.6166 | 4.2953 | 160 | 0.6225 |
| 0.6166 | 4.5638 | 170 | 0.6023 |
| 0.5145 | 4.8322 | 180 | 0.5950 |
| 0.5145 | 5.1007 | 190 | 0.5716 |
| 0.5142 | 5.3691 | 200 | 0.5670 |
| 0.5142 | 5.6376 | 210 | 0.5479 |
| 0.4538 | 5.9060 | 220 | 0.5325 |
| 0.4538 | 6.1745 | 230 | 0.5155 |
| 0.4319 | 6.4430 | 240 | 0.5105 |
| 0.4319 | 6.7114 | 250 | 0.4965 |
| 0.4035 | 6.9799 | 260 | 0.4820 |
| 0.4035 | 7.2483 | 270 | 0.4844 |
| 0.3432 | 7.5168 | 280 | 0.4686 |
| 0.3432 | 7.7852 | 290 | 0.4731 |
| 0.3506 | 8.0537 | 300 | 0.4500 |
| 0.3506 | 8.3221 | 310 | 0.4558 |
| 0.3102 | 8.5906 | 320 | 0.4450 |
| 0.3102 | 8.8591 | 330 | 0.4332 |
| 0.2963 | 9.1275 | 340 | 0.4355 |
| 0.2963 | 9.3960 | 350 | 0.4487 |
| 0.2579 | 9.6644 | 360 | 0.4287 |
| 0.2579 | 9.9329 | 370 | 0.4260 |
| 0.2633 | 10.2013 | 380 | 0.4266 |
| 0.2633 | 10.4698 | 390 | 0.4280 |
| 0.2506 | 10.7383 | 400 | 0.4238 |
| 0.2506 | 11.0067 | 410 | 0.4211 |
| 0.2251 | 11.2752 | 420 | 0.4355 |
| 0.2251 | 11.5436 | 430 | 0.4196 |
| 0.1957 | 11.8121 | 440 | 0.4280 |
| 0.1957 | 12.0805 | 450 | 0.4186 |
| 0.2015 | 12.3490 | 460 | 0.4354 |
| 0.2015 | 12.6174 | 470 | 0.4257 |
| 0.2007 | 12.8859 | 480 | 0.4191 |
| 0.2007 | 13.1544 | 490 | 0.4292 |
| 0.1672 | 13.4228 | 500 | 0.4434 |
| 0.1672 | 13.6913 | 510 | 0.4279 |
| 0.1789 | 13.9597 | 520 | 0.4299 |
| 0.1789 | 14.2282 | 530 | 0.4397 |
| 0.1521 | 14.4966 | 540 | 0.4506 |
| 0.1521 | 14.7651 | 550 | 0.4382 |
| 0.1593 | 15.0336 | 560 | 0.4303 |
| 0.1593 | 15.3020 | 570 | 0.4404 |
| 0.1483 | 15.5705 | 580 | 0.4411 |
| 0.1483 | 15.8389 | 590 | 0.4421 |
| 0.1369 | 16.1074 | 600 | 0.4486 |
| 0.1369 | 16.3758 | 610 | 0.4574 |
| 0.1252 | 16.6443 | 620 | 0.4468 |
| 0.1252 | 16.9128 | 630 | 0.4419 |
| 0.1470 | 17.1812 | 640 | 0.4456 |
| 0.1470 | 17.4497 | 650 | 0.4553 |
| 0.1224 | 17.7181 | 660 | 0.4562 |
| 0.1224 | 17.9866 | 670 | 0.4511 |
| 0.1185 | 18.2550 | 680 | 0.4593 |
| 0.1185 | 18.5235 | 690 | 0.4641 |
| 0.1109 | 18.7919 | 700 | 0.4594 |
| 0.1109 | 19.0604 | 710 | 0.4554 |
| 0.1296 | 19.3289 | 720 | 0.4561 |
| 0.1296 | 19.5973 | 730 | 0.4572 |
| 0.1173 | 19.8658 | 740 | 0.4574 |
### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- PyTorch 2.2.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1