library_name: transformers
tags:
  - trl
  - cpo
  - generated_from_trainer
model-index:
  - name: OpenELM-1_1B-SimPO
    results: []

OpenELM-1_1B-SimPO

This model was trained from scratch on an unknown dataset. The results below are from the final evaluation (step 2800, epoch 2.9304) on the evaluation set:

Loss: 0.8496
Rewards/chosen: -1.1328
Rewards/rejected: -1.7031
Rewards/accuracies: 0.6680
Rewards/margins: 0.5742
Logps/rejected: -171.0
Logps/chosen: -113.0
Logits/rejected: 1.2422
Logits/chosen: -0.5781
Nll Loss: 0.0
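
As a quick sanity check on the numbers above, the reported reward margin can be compared with the difference of the two reported mean rewards. This is a minimal sketch, assuming the usual definition of the margin as the mean of per-example (chosen minus rejected) rewards; the card itself does not state how the metric is computed, and small gaps may come from rounding or reduced-precision logging.

```python
# Final evaluation metrics as reported in this card.
rewards_chosen = -1.1328
rewards_rejected = -1.7031
reported_margin = 0.5742

# ASSUMPTION: margin = mean(chosen - rejected), so it should be close to
# the difference of the two reported means.
implied_margin = rewards_chosen - rewards_rejected
gap = abs(implied_margin - reported_margin)

print(f"implied margin:  {implied_margin:.4f}")
print(f"reported margin: {reported_margin:.4f}")
print(f"absolute gap:    {gap:.4f}")
```

A positive margin together with Rewards/accuracies of 0.6680 means the chosen response scored higher than the rejected one in roughly two thirds of evaluation pairs.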

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 16
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 64
total_eval_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 3
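
The batch-size totals and the learning-rate schedule implied by these settings can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the total step count is an estimate derived from the results table (step 100 is logged at epoch 0.1047), and `lr_at` is a hypothetical helper that mimics a linear warmup followed by a half-cosine decay to zero, not the trainer's own code.

```python
import math

# Values from the hyperparameter list.
learning_rate = 5e-05
train_batch_size = 8            # per device
num_devices = 4
gradient_accumulation_steps = 2
warmup_ratio = 0.1
num_epochs = 3

# Effective number of examples per optimizer step.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

# ASSUMPTION: steps per epoch is estimated from the results table
# (step 100 corresponds to epoch 0.1047); the card does not state it.
steps_per_epoch = round(100 / 0.1047)
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("estimated total steps:", total_steps, "warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 once warmup ends and decays smoothly toward zero by the last step.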

Training results

Training Loss	Epoch	Step	Validation Loss	Rewards/chosen	Rewards/rejected	Rewards/accuracies	Rewards/margins	Logps/rejected	Logps/chosen	Logits/rejected	Logits/chosen
0.9346	0.1047	100	0.9349	-0.3320	-0.4180	0.6133	0.0864	-41.75	-33.25	-7.9688	-8.5625
0.9139	0.2093	200	0.9069	-0.4844	-0.6367	0.6270	0.1504	-63.5	-48.5	-2.4375	-3.4531
0.907	0.3140	300	0.9099	-0.6914	-0.8359	0.6055	0.1416	-83.5	-69.5	-4.0	-5.1875
0.901	0.4186	400	0.8957	-0.8359	-1.0156	0.6328	0.1748	-101.0	-84.0	0.0164	-1.7422
0.8752	0.5233	500	0.8768	-0.7266	-0.9570	0.6582	0.2324	-95.5	-72.5	0.8555	-0.5625
0.8808	0.6279	600	0.8742	-0.8633	-1.0938	0.6445	0.2334	-109.5	-86.0	3.2344	2.1562
0.8277	0.7326	700	0.8679	-0.5195	-0.7734	0.6445	0.2520	-77.5	-52.0	0.3496	-0.7930
0.8341	0.8373	800	0.8503	-0.8047	-1.0859	0.6602	0.2773	-108.5	-80.5	1.3047	0.2188
0.8333	0.9419	900	0.8454	-0.8984	-1.2188	0.6660	0.3184	-121.5	-90.0	1.8438	0.6406
0.8071	1.0466	1000	0.8441	-1.0	-1.3359	0.6699	0.3340	-133.0	-100.0	1.3516	0.1504
0.7845	1.1512	1100	0.8307	-0.8477	-1.2266	0.6660	0.3809	-122.5	-84.5	0.3301	-1.5078
0.7483	1.2559	1200	0.8353	-0.9453	-1.3281	0.6758	0.3809	-133.0	-94.5	0.9805	-0.4160
0.7802	1.3605	1300	0.8363	-0.6211	-1.0	0.7051	0.3828	-100.5	-62.0	0.3418	-1.5859
0.7499	1.4652	1400	0.8228	-0.9727	-1.4141	0.7012	0.4414	-141.0	-97.0	1.4531	-0.1719
0.6966	1.5699	1500	0.8231	-1.0625	-1.5234	0.6836	0.4609	-152.0	-106.0	1.5	-0.3301
0.6921	1.6745	1600	0.8222	-1.0703	-1.5469	0.6875	0.4766	-155.0	-107.0	2.25	0.6133
0.7162	1.7792	1700	0.8106	-1.0312	-1.5391	0.6953	0.5078	-154.0	-103.0	2.4688	0.6992
0.714	1.8838	1800	0.8183	-1.0938	-1.625	0.6855	0.5312	-162.0	-109.5	2.1875	0.0579
0.7068	1.9885	1900	0.8164	-0.9727	-1.5078	0.7031	0.5352	-151.0	-97.5	1.9922	0.3184
0.4781	2.0931	2000	0.8475	-1.1875	-1.7109	0.6797	0.5273	-171.0	-119.0	1.7344	0.0977
0.4964	2.1978	2100	0.8455	-1.0	-1.5547	0.6875	0.5547	-155.0	-100.0	0.9219	-0.9258
0.4723	2.3025	2200	0.8475	-1.1016	-1.6562	0.6934	0.5586	-166.0	-110.0	1.2969	-0.4648
0.5051	2.4071	2300	0.8480	-1.1328	-1.6953	0.6895	0.5664	-170.0	-113.0	1.4141	-0.2891
0.4647	2.5118	2400	0.8463	-1.1406	-1.7188	0.6758	0.5742	-171.0	-114.0	1.4531	-0.3496
0.4442	2.6164	2500	0.8527	-1.2344	-1.7969	0.6680	0.5664	-180.0	-123.5	1.5859	-0.1436
0.4349	2.7211	2600	0.8505	-1.1172	-1.6953	0.6699	0.5742	-169.0	-112.0	1.2422	-0.5898
0.4514	2.8257	2700	0.8493	-1.1172	-1.6953	0.6738	0.5781	-169.0	-112.0	1.1953	-0.6406
0.459	2.9304	2800	0.8496	-1.1328	-1.7031	0.6680	0.5742	-171.0	-113.0	1.2422	-0.5781
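
To read the table at a glance, the following sketch picks the logged step with the lowest validation loss and compares it with the final step. The (step, training loss, validation loss) values are copied from the table above; the variable names are illustrative.

```python
# (step, training loss, validation loss) copied from the results table.
rows = [
    (100, 0.9346, 0.9349), (200, 0.9139, 0.9069), (300, 0.907, 0.9099),
    (400, 0.901, 0.8957), (500, 0.8752, 0.8768), (600, 0.8808, 0.8742),
    (700, 0.8277, 0.8679), (800, 0.8341, 0.8503), (900, 0.8333, 0.8454),
    (1000, 0.8071, 0.8441), (1100, 0.7845, 0.8307), (1200, 0.7483, 0.8353),
    (1300, 0.7802, 0.8363), (1400, 0.7499, 0.8228), (1500, 0.6966, 0.8231),
    (1600, 0.6921, 0.8222), (1700, 0.7162, 0.8106), (1800, 0.714, 0.8183),
    (1900, 0.7068, 0.8164), (2000, 0.4781, 0.8475), (2100, 0.4964, 0.8455),
    (2200, 0.4723, 0.8475), (2300, 0.5051, 0.8480), (2400, 0.4647, 0.8463),
    (2500, 0.4442, 0.8527), (2600, 0.4349, 0.8505), (2700, 0.4514, 0.8493),
    (2800, 0.459, 0.8496),
]

# Step with the lowest validation loss.
best_step, best_train, best_val = min(rows, key=lambda r: r[2])

# Final logged step, for comparison.
final_step, final_train, final_val = rows[-1]

print(f"best:  step {best_step}, val loss {best_val:.4f}")
print(f"final: step {final_step}, val loss {final_val:.4f}")
print(f"final train/val gap: {final_val - final_train:.4f}")
```

In this table the validation loss is lowest at step 1700, late in the second epoch, while the training loss keeps falling through the third epoch, so the final checkpoint is not the one with the best validation loss.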

Framework versions

Transformers 4.44.2
PyTorch 2.3.0
Datasets 3.0.0
Tokenizers 0.19.1