spin-trans

This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-full. The training dataset is not specified (the card lists it as None). It achieves the following results on the evaluation set:

Loss: 0.0027
Rewards/real: -3.8149
Rewards/generated: -24.3554
Rewards/accuracies: 1.0
Rewards/margins: 20.5405
Logps/generated: -336.8123
Logps/real: -163.0993
Logits/generated: -2.3894
Logits/real: -1.8917
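
As a quick arithmetic check, the reported margin agrees with the difference between the two reward values. The sketch below assumes the convention used by DPO-style trainers, where Rewards/margins is Rewards/real minus Rewards/generated. The card itself does not state this definition, so treat it as an assumption that the numbers happen to be consistent with.

```python
# Values copied from the evaluation results listed above.
rewards_real = -3.8149
rewards_generated = -24.3554
reported_margin = 20.5405

# Assumed definition (not stated in the card):
# margin = reward(real) - reward(generated).
computed_margin = rewards_real - rewards_generated

# Allow a small tolerance because the card rounds to four decimals.
matches = abs(computed_margin - reported_margin) < 1e-3

print(f"computed margin: {computed_margin:.4f}")
print(f"consistent with reported margin: {matches}")
```

A positive margin means the real (reference) responses score higher than the model-generated ones, which is in line with the reported Rewards/accuracies of 1.0.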

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-07
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
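
The relationship between some of these values can be sketched in a few lines. The example below is only illustrative: it assumes no gradient accumulation (the card does not list it), and it uses a hypothetical run length of 1000 optimizer steps, since the card does not give the total step count. The `linear_lr` helper is a hand-written approximation of a linear warmup-then-decay schedule, not the trainer's own code.

```python
import math

# Values taken from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH_SIZE = 8
NUM_DEVICES = 4
PEAK_LR = 5e-07
WARMUP_RATIO = 0.1

# Assumption: gradient accumulation is 1 (not listed in the card).
GRAD_ACCUM_STEPS = 1


def total_batch_size(per_device, num_devices, grad_accum=GRAD_ACCUM_STEPS):
    """Effective number of examples per optimizer step."""
    return per_device * num_devices * grad_accum


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print("total_train_batch_size:",
      total_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, NUM_DEVICES))

# HYPOTHETICAL_TOTAL_STEPS is an illustrative value, not from the card.
HYPOTHETICAL_TOTAL_STEPS = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, HYPOTHETICAL_TOTAL_STEPS))
```

With 8 examples per device on 4 devices, the product is 32, matching total_train_batch_size. Under the hypothetical 1000-step run, the learning rate would peak at 5e-07 after 100 steps and then fall linearly to zero.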

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/real	Rewards/generated	Rewards/accuracies	Rewards/margins	Logps/generated	Logps/real	Logits/generated	Logits/real
0.0085	0.1	100	0.0130	0.5228	-9.3297	1.0	9.8526	-186.5559	-119.7219	-2.7911	-2.5502
0.0041	0.21	200	0.0070	-0.1706	-14.7969	1.0	14.6263	-241.2277	-126.6563	-2.6228	-2.2904
0.0007	0.31	300	0.0073	-2.7706	-22.3901	0.9974	19.6195	-317.1598	-152.6565	-2.4825	-1.9073
0.0049	0.41	400	0.0044	-2.9093	-19.4947	1.0	16.5854	-288.2053	-154.0429	-2.6010	-2.2355
0.001	0.52	500	0.0050	-1.5600	-21.7213	1.0	20.1614	-310.4720	-140.5501	-2.5715	-2.2758
0.0004	0.62	600	0.0029	-2.4635	-24.2161	1.0	21.7526	-335.4198	-149.5852	-2.4626	-2.0545
0.0004	0.72	700	0.0034	-1.9810	-20.7429	1.0	18.7619	-300.6877	-144.7602	-2.4823	-2.0980
0.0003	0.83	800	0.0034	-4.2857	-23.6128	1.0	19.3270	-329.3861	-167.8074	-2.3861	-1.8496
0.0003	0.93	900	0.0027	-3.8149	-24.3554	1.0	20.5405	-336.8123	-163.0993	-2.3894	-1.8917

Framework versions

Transformers 4.37.0
PyTorch 2.1.2+cu121
Datasets 2.14.6
Tokenizers 0.15.2
