AA_preference_Cherry_0_50

This model is a fine-tuned version of llava-hf/llava-v1.6-mistral-7b-hf on the AA_preference_Cherry_0_50 dataset. It achieves the following results on the evaluation set:

Loss: 0.4853
Rewards/chosen: 2.7507
Rewards/rejected: -0.5491
Rewards/accuracies: 0.8548
Rewards/margins: 3.2998
Logps/rejected: -247.9047
Logps/chosen: -301.7189
Logits/rejected: -2.0198
Logits/chosen: -2.0547

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 4
total_train_batch_size: 256
total_eval_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 10
num_epochs: 3.0

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/chosen	Rewards/rejected	Rewards/accuracies	Rewards/margins	Logps/rejected	Logps/chosen	Logits/rejected	Logits/chosen
0.563	0.5904	40	0.4594	2.4856	0.6720	0.8185	1.8135	-235.6931	-304.3701	-2.1588	-2.1792
0.154	1.1808	80	0.4602	2.7861	0.3929	0.8427	2.3933	-238.4848	-301.3644	-2.1104	-2.1357
0.2027	1.7712	120	0.4869	2.5066	-0.4566	0.8548	2.9632	-246.9799	-304.1596	-2.1461	-2.1726
0.0725	2.3616	160	0.4819	2.7680	-0.4125	0.8629	3.1805	-246.5386	-301.5459	-2.0424	-2.0766
0.0505	2.9520	200	0.4848	2.7520	-0.5489	0.8548	3.3009	-247.9026	-301.7058	-2.0194	-2.0544

Framework versions

Transformers 4.45.2
Pytorch 2.4.0+cu121
Datasets 2.21.0
Tokenizers 0.20.3

htlou
/

backup_0202_llamafactory_AA_preference_Cherry_0_50-llava-mistral

AA_preference_Cherry_0_50

Model description

Intended uses & limitations

Training and evaluation data

Training procedure

Training hyperparameters

Training results

Framework versions

Model tree for htlou/backup_0202_llamafactory_AA_preference_Cherry_0_50-llava-mistral

Evaluation results