ft-google-gemma-2b-it-qlora-v2

This model is a QLoRA (PEFT) adapter fine-tuned from google/gemma-2b-it on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 5.8028
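
Since this repository holds a PEFT adapter rather than full model weights, a minimal sketch of loading it on top of the base model is shown below. The adapter repo id and the dtype are placeholders/assumptions, not documented in this card.

```python
# Sketch: load the QLoRA adapter on top of google/gemma-2b-it.
# The adapter repo id is a placeholder; replace it with the actual Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2b-it"
adapter_id = "your-username/ft-google-gemma-2b-it-qlora-v2"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # dtype is an assumption; adjust to your hardware
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Write a short haiku about fine-tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```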

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding TrainingArguments follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 80
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 1000
  • mixed_precision_training: Native AMP
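
For reference, here is a hedged sketch of how the hyperparameters above might map onto transformers.TrainingArguments. The output directory and the fp16 flag (inferred from "Native AMP") are assumptions; the dataset, LoRA configuration, and Trainer setup are not shown because they are not documented in this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ft-google-gemma-2b-it-qlora-v2",  # placeholder output path
    learning_rate=3e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    gradient_accumulation_steps=8,   # 10 x 8 = total train batch size of 80
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=1000,
    fp16=True,                       # assumption: "Native AMP" mixed precision
)
```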

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.2955 | 10.0 | 10 | 2.7587 |
| 0.232 | 20.0 | 20 | 2.5366 |
| 0.177 | 30.0 | 30 | 2.4293 |
| 0.1317 | 40.0 | 40 | 2.4247 |
| 0.0893 | 50.0 | 50 | 2.5725 |
| 0.0472 | 60.0 | 60 | 2.8254 |
| 0.0147 | 70.0 | 70 | 3.2230 |
| 0.0035 | 80.0 | 80 | 3.7653 |
| 0.0015 | 90.0 | 90 | 4.0707 |
| 0.0008 | 100.0 | 100 | 4.2730 |
| 0.0006 | 110.0 | 110 | 4.3961 |
| 0.0006 | 120.0 | 120 | 4.4900 |
| 0.0005 | 130.0 | 130 | 4.5394 |
| 0.0005 | 140.0 | 140 | 4.5999 |
| 0.0005 | 150.0 | 150 | 4.6447 |
| 0.0004 | 160.0 | 160 | 4.6848 |
| 0.0004 | 170.0 | 170 | 4.7255 |
| 0.0004 | 180.0 | 180 | 4.7569 |
| 0.0004 | 190.0 | 190 | 4.7802 |
| 0.0004 | 200.0 | 200 | 4.8020 |
| 0.0004 | 210.0 | 210 | 4.8522 |
| 0.0004 | 220.0 | 220 | 4.8690 |
| 0.0004 | 230.0 | 230 | 4.8940 |
| 0.0004 | 240.0 | 240 | 4.9423 |
| 0.0004 | 250.0 | 250 | 4.9723 |
| 0.0004 | 260.0 | 260 | 4.9644 |
| 0.0004 | 270.0 | 270 | 4.9923 |
| 0.0004 | 280.0 | 280 | 5.0230 |
| 0.0004 | 290.0 | 290 | 5.0319 |
| 0.0004 | 300.0 | 300 | 5.0627 |
| 0.0004 | 310.0 | 310 | 5.1078 |
| 0.0004 | 320.0 | 320 | 5.1167 |
| 0.0004 | 330.0 | 330 | 5.1260 |
| 0.0004 | 340.0 | 340 | 5.1586 |
| 0.0004 | 350.0 | 350 | 5.1803 |
| 0.0004 | 360.0 | 360 | 5.1652 |
| 0.0004 | 370.0 | 370 | 5.1692 |
| 0.0004 | 380.0 | 380 | 5.1980 |
| 0.0004 | 390.0 | 390 | 5.2254 |
| 0.0004 | 400.0 | 400 | 5.2434 |
| 0.0004 | 410.0 | 410 | 5.2792 |
| 0.0004 | 420.0 | 420 | 5.2699 |
| 0.0004 | 430.0 | 430 | 5.2906 |
| 0.0004 | 440.0 | 440 | 5.3069 |
| 0.0004 | 450.0 | 450 | 5.3063 |
| 0.0004 | 460.0 | 460 | 5.3275 |
| 0.0004 | 470.0 | 470 | 5.3406 |
| 0.0004 | 480.0 | 480 | 5.3319 |
| 0.0004 | 490.0 | 490 | 5.3354 |
| 0.0004 | 500.0 | 500 | 5.3601 |
| 0.0004 | 510.0 | 510 | 5.4094 |
| 0.0004 | 520.0 | 520 | 5.4175 |
| 0.0004 | 530.0 | 530 | 5.4083 |
| 0.0004 | 540.0 | 540 | 5.3947 |
| 0.0004 | 550.0 | 550 | 5.4211 |
| 0.0004 | 560.0 | 560 | 5.4287 |
| 0.0004 | 570.0 | 570 | 5.4580 |
| 0.0004 | 580.0 | 580 | 5.4610 |
| 0.0004 | 590.0 | 590 | 5.4775 |
| 0.0004 | 600.0 | 600 | 5.5165 |
| 0.0004 | 610.0 | 610 | 5.5356 |
| 0.0004 | 620.0 | 620 | 5.5142 |
| 0.0004 | 630.0 | 630 | 5.4963 |
| 0.0004 | 640.0 | 640 | 5.5114 |
| 0.0004 | 650.0 | 650 | 5.5223 |
| 0.0004 | 660.0 | 660 | 5.5468 |
| 0.0004 | 670.0 | 670 | 5.5543 |
| 0.0004 | 680.0 | 680 | 5.5731 |
| 0.0004 | 690.0 | 690 | 5.6010 |
| 0.0004 | 700.0 | 700 | 5.6050 |
| 0.0004 | 710.0 | 710 | 5.6203 |
| 0.0004 | 720.0 | 720 | 5.6415 |
| 0.0004 | 730.0 | 730 | 5.6312 |
| 0.0004 | 740.0 | 740 | 5.6209 |
| 0.0004 | 750.0 | 750 | 5.6283 |
| 0.0004 | 760.0 | 760 | 5.6605 |
| 0.0004 | 770.0 | 770 | 5.6683 |
| 0.0004 | 780.0 | 780 | 5.6686 |
| 0.0004 | 790.0 | 790 | 5.6810 |
| 0.0004 | 800.0 | 800 | 5.6837 |
| 0.0004 | 810.0 | 810 | 5.7018 |
| 0.0004 | 820.0 | 820 | 5.7189 |
| 0.0004 | 830.0 | 830 | 5.7218 |
| 0.0004 | 840.0 | 840 | 5.7053 |
| 0.0004 | 850.0 | 850 | 5.7328 |
| 0.0004 | 860.0 | 860 | 5.7495 |
| 0.0004 | 870.0 | 870 | 5.7220 |
| 0.0004 | 880.0 | 880 | 5.7142 |
| 0.0004 | 890.0 | 890 | 5.7272 |
| 0.0004 | 900.0 | 900 | 5.7643 |
| 0.0004 | 910.0 | 910 | 5.7750 |
| 0.0004 | 920.0 | 920 | 5.7762 |
| 0.0004 | 930.0 | 930 | 5.7899 |
| 0.0004 | 940.0 | 940 | 5.7878 |
| 0.0004 | 950.0 | 950 | 5.7727 |
| 0.0004 | 960.0 | 960 | 5.7630 |
| 0.0004 | 970.0 | 970 | 5.7806 |
| 0.0004 | 980.0 | 980 | 5.7953 |
| 0.0004 | 990.0 | 990 | 5.7662 |
| 0.0004 | 1000.0 | 1000 | 5.8028 |

Framework versions

  • PEFT 0.9.0
  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2