---
library_name: transformers
tags:
  - trl
  - cpo
  - alignment-handbook
  - generated_from_trainer
model-index:
  - name: OpenELM-1_1B-CPO
    results: []
---

# OpenELM-1_1B-CPO

This model was trained with CPO (Contrastive Preference Optimization) via TRL; the training dataset is not recorded in this card, and the repository name suggests an OpenELM-1.1B base. It achieves the following results on the evaluation set (the sketch after the list shows how TRL defines these quantities):

- Logits/chosen: -8.875
- Logits/rejected: -7.5312
- Logps/chosen: -364.0
- Logps/rejected: -444.0
- Loss: 2.1904
- Nll Loss: 1.1719
- Rewards/accuracies: 0.5918
- Rewards/chosen: -3.6406
- Rewards/margins: 0.8008
- Rewards/rejected: -4.4375
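
These figures follow TRL's CPO conventions: the trainer uses no reference model, so the `rewards/*` values are just the policy's summed log-probabilities scaled by `beta`, and the margin is chosen minus rejected. The reported numbers are consistent with `beta ≈ 0.01` (e.g. `-3.6406 ≈ 0.01 × -364.0`), though the card does not state it. A minimal sketch of how these quantities are computed, assuming TRL's default sigmoid CPO loss:

```python
import torch
import torch.nn.functional as F

BETA = 0.01  # assumption: inferred from rewards/chosen ÷ logps/chosen above

def cpo_metrics(logps_chosen, logps_rejected):
    """logps_* are the policy's summed token log-probs for each preference pair."""
    rewards_chosen = BETA * logps_chosen           # rewards/chosen
    rewards_rejected = BETA * logps_rejected       # rewards/rejected
    margins = rewards_chosen - rewards_rejected    # rewards/margins
    accuracy = (rewards_chosen > rewards_rejected).float().mean()
    # Preference term only; the "Loss" reported above additionally includes
    # an NLL term on the chosen completions (logged separately as "Nll Loss").
    pref_loss = -F.logsigmoid(BETA * (logps_chosen - logps_rejected)).mean()
    return rewards_chosen, rewards_rejected, margins, accuracy, pref_loss

# Final eval pair from this card: rewards ≈ (-3.64, -4.44), margin ≈ 0.80.
print(cpo_metrics(torch.tensor([-364.0]), torch.tensor([-444.0])))
```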

## Model description

More information needed

## Intended uses & limitations

More information needed
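
No usage guidance is provided; below is a minimal inference sketch under the assumption that the checkpoint is published on the Hub as `CharlesLi/OpenELM-1_1B-CPO` with its tokenizer saved alongside the model (OpenELM-derived checkpoints require `trust_remote_code=True`):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "CharlesLi/OpenELM-1_1B-CPO"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```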

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (the sketch after this list shows how they map onto TRL's `CPOConfig`):

- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
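
A sketch of how these settings map onto TRL's `CPOConfig`/`CPOTrainer` (API as of TRL releases contemporary with Transformers 4.44). The base model, tokenizer, dataset id, `beta`, and `bf16` below are assumptions, not recorded in this card:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import CPOConfig, CPOTrainer

# Assumed base model, inferred from the model name; OpenELM ships without
# a tokenizer, and Apple's model card pairs it with Llama-2's.
model = AutoModelForCausalLM.from_pretrained("apple/OpenELM-1_1B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Placeholder dataset id; CPOTrainer expects "prompt"/"chosen"/"rejected" columns.
dataset = load_dataset("your-org/your-preference-dataset")

args = CPOConfig(
    output_dir="OpenELM-1_1B-CPO",
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # train_batch_size
    per_device_eval_batch_size=16,   # eval_batch_size
    gradient_accumulation_steps=2,   # 8 per device x 4 GPUs x 2 = 64 effective
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    beta=0.01,                       # assumption: inferred from the eval rewards above
    bf16=True,                       # assumption: the eval logps look bf16-quantized
)

trainer = CPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()
```

The Adam settings listed above are Transformers' defaults, so they need no explicit configuration; the multi-GPU and total-batch figures correspond to launching across 4 devices (e.g. with `accelerate`), since 8 per device × 4 GPUs × 2 accumulation steps = 64.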

### Training results

| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Nll Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 2.4271 | 0.1047 | 100 | -12.3125 | -12.125 | -336.0 | -328.0 | 2.2959 | 1.0859 | 0.4980 | -3.3594 | -0.0850 | -3.2812 |
| 2.2538 | 0.2093 | 200 | -9.875 | -9.5 | -338.0 | -346.0 | 2.1836 | 1.0938 | 0.5234 | -3.3906 | 0.0640 | -3.4531 |
| 2.1253 | 0.3140 | 300 | -11.4375 | -11.0 | -346.0 | -360.0 | 2.1307 | 1.1172 | 0.5176 | -3.4531 | 0.1416 | -3.5938 |
| 2.0609 | 0.4186 | 400 | -11.125 | -10.625 | -332.0 | -344.0 | 2.1359 | 1.0703 | 0.5293 | -3.3281 | 0.1187 | -3.4375 |
| 2.1905 | 0.5233 | 500 | -9.3125 | -8.5 | -338.0 | -352.0 | 2.1286 | 1.0859 | 0.5254 | -3.375 | 0.1357 | -3.5156 |
| 2.1304 | 0.6279 | 600 | -10.625 | -9.625 | -360.0 | -398.0 | 2.1410 | 1.1562 | 0.5723 | -3.6094 | 0.3672 | -3.9688 |
| 2.2554 | 0.7326 | 700 | -9.6875 | -8.5625 | -374.0 | -416.0 | 2.1848 | 1.2031 | 0.5664 | -3.7344 | 0.4258 | -4.1562 |
| 2.0796 | 0.8373 | 800 | -7.8438 | -7.0312 | -346.0 | -374.0 | 2.1224 | 1.1172 | 0.5469 | -3.4531 | 0.2852 | -3.75 |
| 2.1021 | 0.9419 | 900 | -6.2812 | -5.2812 | -350.0 | -390.0 | 2.1099 | 1.1328 | 0.5723 | -3.5 | 0.4062 | -3.9062 |
| 1.5182 | 1.0471 | 1000 | -10.625 | -9.375 | -350.0 | -386.0 | 2.1662 | 1.125 | 0.5664 | -3.5 | 0.3633 | -3.8594 |
| 1.4917 | 1.1518 | 1100 | -7.875 | -6.4688 | -356.0 | -400.0 | 2.1588 | 1.1484 | 0.5703 | -3.5625 | 0.4395 | -4.0 |
| 1.5219 | 1.2564 | 1200 | -7.7812 | -6.6562 | -364.0 | -420.0 | 2.1449 | 1.1719 | 0.5938 | -3.625 | 0.5586 | -4.1875 |
| 1.5292 | 1.3611 | 1300 | -8.875 | -7.75 | -354.0 | -402.0 | 2.1489 | 1.1406 | 0.5742 | -3.5312 | 0.4785 | -4.0 |
| 1.4257 | 1.4657 | 1400 | -9.25 | -7.7188 | -358.0 | -410.0 | 2.1193 | 1.1562 | 0.5801 | -3.5781 | 0.5156 | -4.0938 |
| 1.4366 | 1.5704 | 1500 | -8.9375 | -7.6875 | -358.0 | -416.0 | 2.0983 | 1.1562 | 0.5898 | -3.5938 | 0.5586 | -4.1562 |
| 1.5246 | 1.6750 | 1600 | -6.9062 | -5.4688 | -358.0 | -420.0 | 2.1191 | 1.1562 | 0.5938 | -3.5781 | 0.625 | -4.2188 |
| 1.4534 | 1.7797 | 1700 | -10.0625 | -9.0625 | -348.0 | -404.0 | 2.0829 | 1.1172 | 0.5762 | -3.4688 | 0.5625 | -4.0312 |
| 1.4551 | 1.8844 | 1800 | -8.1875 | -6.8438 | -356.0 | -416.0 | 2.1033 | 1.1484 | 0.5898 | -3.5625 | 0.6016 | -4.1562 |
| 1.4969 | 1.9890 | 1900 | -9.3125 | -8.125 | -354.0 | -412.0 | 2.1046 | 1.1406 | 0.5762 | -3.5312 | 0.5938 | -4.125 |
| 0.9984 | 2.0937 | 2000 | -9.1875 | -7.9375 | -364.0 | -428.0 | 2.1806 | 1.1719 | 0.5781 | -3.6406 | 0.6367 | -4.2812 |
| 0.9885 | 2.1983 | 2100 | -8.6875 | -7.4062 | -370.0 | -448.0 | 2.1927 | 1.1875 | 0.5801 | -3.6875 | 0.7930 | -4.5 |
| 0.9814 | 2.3030 | 2200 | -8.8125 | -7.5 | -362.0 | -436.0 | 2.1867 | 1.1719 | 0.5742 | -3.625 | 0.7266 | -4.3438 |
| 0.9844 | 2.4076 | 2300 | -8.375 | -7.125 | -368.0 | -452.0 | 2.1905 | 1.1875 | 0.5996 | -3.6875 | 0.8438 | -4.5312 |
| 0.9931 | 2.5123 | 2400 | -8.6875 | -7.375 | -364.0 | -442.0 | 2.1843 | 1.1719 | 0.5820 | -3.6406 | 0.7930 | -4.4375 |
| 0.9537 | 2.6170 | 2500 | -8.8125 | -7.5 | -364.0 | -446.0 | 2.1907 | 1.1719 | 0.5898 | -3.6406 | 0.8125 | -4.4688 |
| 0.9512 | 2.7216 | 2600 | -8.8125 | -7.5 | -364.0 | -446.0 | 2.1918 | 1.1719 | 0.5898 | -3.6406 | 0.8086 | -4.4375 |
| 0.9604 | 2.8263 | 2700 | -8.875 | -7.5312 | -364.0 | -442.0 | 2.1906 | 1.1719 | 0.5879 | -3.6406 | 0.7969 | -4.4375 |
| 1.0208 | 2.9309 | 2800 | -8.875 | -7.5312 | -364.0 | -444.0 | 2.1904 | 1.1719 | 0.5918 | -3.6406 | 0.8008 | -4.4375 |

### Framework versions

- Transformers 4.44.2
- PyTorch 2.3.0
- Datasets 3.0.0
- Tokenizers 0.19.1