paligemmamultidataset

This model is a fine-tuned version of google/paligemma-3b-pt-224 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0804

Model description

More information needed
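
The description has not been filled in yet, but the framework versions below show the run used PEFT, so this repository is a parameter-efficient adapter for google/paligemma-3b-pt-224 rather than a standalone checkpoint. A minimal loading sketch, assuming the adapter lives at RoyRoyRpy/paligemmamultidataset (this card's repo id) and that you have accepted the gated base model's access terms:

```python
# Minimal sketch, not an official example: load the gated base model and
# attach this PEFT adapter on top of it. Adjust ids/paths to your setup.
from peft import PeftModel
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

BASE_ID = "google/paligemma-3b-pt-224"
ADAPTER_ID = "RoyRoyRpy/paligemmamultidataset"  # assumed adapter repo id

processor = AutoProcessor.from_pretrained(BASE_ID)
model = PaliGemmaForConditionalGeneration.from_pretrained(BASE_ID)
model = PeftModel.from_pretrained(model, ADAPTER_ID)  # wrap base with the adapter
model.eval()
```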

Intended uses & limitations

More information needed
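
The fine-tuning task itself is undocumented, so the following is only a generic PaliGemma-style inference sketch continuing from the loading code above; the image URL is a placeholder and the "caption en" prompt is the base model's captioning convention, used here as an illustrative assumption:

```python
# Hedged inference sketch, reusing `processor` and `model` from the loading
# example above. The URL and prompt are placeholders, not documented usage.
import requests
import torch
from PIL import Image

url = "https://example.com/some_image.jpg"  # replace with a real image
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

inputs = processor(text="caption en", images=image, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[-1]

with torch.inference_mode():
    output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
print(processor.decode(output_ids[0][prompt_len:], skip_special_tokens=True))
```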

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 2e-05
  • train_batch_size: 10
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 40
  • optimizer: AdamW (adamw_hf) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 20
  • num_epochs: 12
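
As noted above, here is a minimal sketch of how these values map onto transformers.TrainingArguments; the output_dir and anything not in the bullet list (precision flags, logging, checkpointing) are assumptions left at their defaults, not recovered settings:

```python
# Sketch only: the reported hyperparameters expressed as TrainingArguments.
# output_dir is a placeholder; unlisted options keep their defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="paligemmamultidataset",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch: 10 * 4 = 40
    optim="adamw_hf",               # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=20,
    num_train_epochs=12,
)
```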

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 4.7372 | 0.1233 | 100 | 3.6204 |
| 3.213 | 0.2465 | 200 | 3.0079 |
| 2.6333 | 0.3698 | 300 | 2.6781 |
| 2.4789 | 0.4931 | 400 | 2.5150 |
| 2.2423 | 0.6163 | 500 | 2.4068 |
| 2.1555 | 0.7396 | 600 | 2.2984 |
| 2.1202 | 0.8629 | 700 | 2.2410 |
| 2.1345 | 0.9861 | 800 | 2.1763 |
| 2.0111 | 1.1094 | 900 | 2.1304 |
| 1.9591 | 1.2327 | 1000 | 2.1017 |
| 1.8412 | 1.3559 | 1100 | 2.0653 |
| 1.8451 | 1.4792 | 1200 | 2.0440 |
| 1.8383 | 1.6025 | 1300 | 2.0194 |
| 1.8782 | 1.7257 | 1400 | 1.9879 |
| 1.7502 | 1.8490 | 1500 | 1.9676 |
| 1.7612 | 1.9723 | 1600 | 1.9579 |
| 1.7285 | 2.0955 | 1700 | 1.9335 |
| 1.6545 | 2.2188 | 1800 | 1.9220 |
| 1.6289 | 2.3421 | 1900 | 1.9146 |
| 1.7027 | 2.4653 | 2000 | 1.8895 |
| 1.5917 | 2.5886 | 2100 | 1.8812 |
| 1.5515 | 2.7119 | 2200 | 1.8754 |
| 1.5598 | 2.8351 | 2300 | 1.8583 |
| 1.625 | 2.9584 | 2400 | 1.8443 |
| 1.4844 | 3.0817 | 2500 | 1.8452 |
| 1.4847 | 3.2049 | 2600 | 1.8313 |
| 1.4573 | 3.3282 | 2700 | 1.8216 |
| 1.446 | 3.4515 | 2800 | 1.8087 |
| 1.446 | 3.5747 | 2900 | 1.8062 |
| 1.4052 | 3.6980 | 3000 | 1.8127 |
| 1.4376 | 3.8213 | 3100 | 1.7946 |
| 1.4436 | 3.9445 | 3200 | 1.7834 |
| 1.3534 | 4.0678 | 3300 | 1.8001 |
| 1.3562 | 4.1911 | 3400 | 1.7946 |
| 1.3416 | 4.3143 | 3500 | 1.7894 |
| 1.269 | 4.4376 | 3600 | 1.7802 |
| 1.3105 | 4.5609 | 3700 | 1.7751 |
| 1.3331 | 4.6841 | 3800 | 1.7627 |
| 1.2788 | 4.8074 | 3900 | 1.7766 |
| 1.256 | 4.9307 | 4000 | 1.7723 |
| 1.2342 | 5.0539 | 4100 | 1.7943 |
| 1.1391 | 5.1772 | 4200 | 1.7807 |
| 1.165 | 5.3005 | 4300 | 1.8016 |
| 1.2122 | 5.4237 | 4400 | 1.7840 |
| 1.1536 | 5.5470 | 4500 | 1.7805 |
| 1.217 | 5.6703 | 4600 | 1.7775 |
| 1.1769 | 5.7935 | 4700 | 1.7817 |
| 1.225 | 5.9168 | 4800 | 1.7758 |
| 1.1306 | 6.0401 | 4900 | 1.8010 |
| 1.0248 | 6.1633 | 5000 | 1.8035 |
| 1.036 | 6.2866 | 5100 | 1.8228 |
| 1.1205 | 6.4099 | 5200 | 1.8145 |
| 1.0873 | 6.5331 | 5300 | 1.7970 |
| 1.0785 | 6.6564 | 5400 | 1.8077 |
| 1.0628 | 6.7797 | 5500 | 1.8102 |
| 1.0423 | 6.9029 | 5600 | 1.8027 |
| 1.0392 | 7.0262 | 5700 | 1.8268 |
| 0.9586 | 7.1495 | 5800 | 1.8684 |
| 0.8986 | 7.2727 | 5900 | 1.8406 |
| 0.9616 | 7.3960 | 6000 | 1.8408 |
| 1.026 | 7.5193 | 6100 | 1.8500 |
| 0.9494 | 7.6425 | 6200 | 1.8326 |
| 0.9769 | 7.7658 | 6300 | 1.8455 |
| 0.9322 | 7.8891 | 6400 | 1.8445 |
| 0.9305 | 8.0123 | 6500 | 1.8578 |
| 0.8791 | 8.1356 | 6600 | 1.8941 |
| 0.8852 | 8.2589 | 6700 | 1.9108 |
| 0.8771 | 8.3821 | 6800 | 1.8951 |
| 0.8697 | 8.5054 | 6900 | 1.9177 |
| 0.8676 | 8.6287 | 7000 | 1.9179 |
| 0.8527 | 8.7519 | 7100 | 1.8762 |
| 0.8281 | 8.8752 | 7200 | 1.9050 |
| 0.88 | 8.9985 | 7300 | 1.9411 |
| 0.7569 | 9.1217 | 7400 | 1.9684 |
| 0.7265 | 9.2450 | 7500 | 1.9705 |
| 0.7659 | 9.3683 | 7600 | 1.9602 |
| 0.7858 | 9.4915 | 7700 | 1.9635 |
| 0.7349 | 9.6148 | 7800 | 1.9725 |
| 0.7882 | 9.7381 | 7900 | 1.9662 |
| 0.8001 | 9.8613 | 8000 | 1.9443 |
| 0.8258 | 9.9846 | 8100 | 1.9441 |
| 0.7034 | 10.1079 | 8200 | 2.0047 |
| 0.6926 | 10.2311 | 8300 | 2.0246 |
| 0.6978 | 10.3544 | 8400 | 2.0163 |
| 0.7311 | 10.4777 | 8500 | 2.0388 |
| 0.6835 | 10.6009 | 8600 | 2.0432 |
| 0.7273 | 10.7242 | 8700 | 2.0073 |
| 0.6887 | 10.8475 | 8800 | 2.0164 |
| 0.7032 | 10.9707 | 8900 | 2.0060 |
| 0.6222 | 11.0940 | 9000 | 2.0942 |
| 0.6344 | 11.2173 | 9100 | 2.0941 |
| 0.5905 | 11.3405 | 9200 | 2.1227 |
| 0.6398 | 11.4638 | 9300 | 2.1075 |
| 0.6342 | 11.5871 | 9400 | 2.1051 |
| 0.6245 | 11.7103 | 9500 | 2.0911 |
| 0.6529 | 11.8336 | 9600 | 2.0880 |
| 0.6416 | 11.9569 | 9700 | 2.0804 |
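
Note that the validation loss bottoms out around 1.76 near epochs 4-5 (roughly steps 3600-4600) and then climbs steadily to the final 2.0804, so the last checkpoint is not the best one by validation loss; the later epochs appear to overfit.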

Framework versions

  • PEFT 0.13.0
  • Transformers 4.46.0.dev0
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0