openchat-3.6-8b-20240522_iter2

This model is a fine-tuned version of RyanYr/openchat-3.6-8b-20240522_iter1 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5377
  • Rewards/chosen: -0.2459
  • Rewards/rejected: -0.7375
  • Rewards/accuracies: 0.7200
  • Rewards/margins: 0.4916
  • Logps/rejected: -139.4380
  • Logps/chosen: -132.3574
  • Logits/rejected: -1.3194
  • Logits/chosen: -1.3369
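
The metric names above match those logged by preference-optimization trainers (for example, TRL's DPOTrainer), although this card does not name the training method. Assuming that convention, the reward margin is the chosen reward minus the rejected reward, and the reported numbers are consistent with it. A minimal check:

```python
# Final evaluation metrics copied from the list above.
rewards_chosen = -0.2459
rewards_rejected = -0.7375
reported_margin = 0.4916

# Assumption: margin = chosen reward - rejected reward, the usual convention
# in preference-optimization logging. The card itself does not define it.
computed_margin = rewards_chosen - rewards_rejected

# Allow a small tolerance because the reported values are rounded to 4 places.
matches = abs(computed_margin - reported_margin) < 1e-4

print(f"computed margin: {computed_margin:.4f}")
print(f"matches reported margin: {matches}")
```

Under the same assumption, a positive margin with both rewards negative would mean the likelihood of rejected responses fell further than that of chosen responses relative to the reference model.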

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-07
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • total_eval_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 2
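
The relationship between the batch-size entries, and the shape of a warmup-plus-cosine learning-rate schedule, can be sketched as follows. The function `lr_at` and the value `TOTAL_STEPS` are illustrative assumptions: the card does not state the total number of optimizer steps, and whether the run used exactly this schedule shape (linear warmup, then cosine decay to zero) is not confirmed by the card.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 1              # per-device batch size
num_devices = 4
gradient_accumulation_steps = 16
learning_rate = 2e-07
warmup_steps = 100

# Effective batch size per optimizer step:
# per-device batch x number of devices x accumulation steps.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(f"total_train_batch_size = {total_train_batch_size}")

# Hypothetical total step count, used only to illustrate the schedule.
TOTAL_STEPS = 1754


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup to the peak rate, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 900, TOTAL_STEPS):
    print(f"step {step:>4}: lr = {lr_at(step):.3e}")
```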

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.7344 | 0.1140 | 100 | 0.7121 | 0.3107 | 0.3572 | 0.4000 | -0.0465 | -128.4905 | -126.7911 | -1.4523 | -1.4659 |
| 0.6553 | 0.2281 | 200 | 0.6966 | 0.2237 | 0.2513 | 0.5200 | -0.0276 | -129.5494 | -127.6609 | -1.4428 | -1.4571 |
| 0.669 | 0.3421 | 300 | 0.6798 | 0.1031 | 0.0581 | 0.5200 | 0.0450 | -131.4815 | -128.8666 | -1.4122 | -1.4281 |
| 0.6402 | 0.4561 | 400 | 0.6595 | 0.0694 | -0.0114 | 0.6400 | 0.0808 | -132.1772 | -129.2041 | -1.4254 | -1.4406 |
| 0.6716 | 0.5702 | 500 | 0.6351 | 0.1022 | -0.0221 | 0.6400 | 0.1243 | -132.2838 | -128.8764 | -1.4550 | -1.4689 |
| 0.655 | 0.6842 | 600 | 0.6278 | 0.1039 | -0.0286 | 0.6000 | 0.1325 | -132.3487 | -128.8587 | -1.4625 | -1.4766 |
| 0.5943 | 0.7982 | 700 | 0.6084 | 0.0643 | -0.1073 | 0.6400 | 0.1716 | -133.1360 | -129.2548 | -1.4485 | -1.4622 |
| 0.6048 | 0.9123 | 800 | 0.6002 | 0.0902 | -0.1175 | 0.6800 | 0.2077 | -133.2379 | -128.9962 | -1.4607 | -1.4735 |
| 0.4934 | 1.0263 | 900 | 0.5798 | 0.0298 | -0.2745 | 0.7200 | 0.3043 | -134.8078 | -129.5996 | -1.4349 | -1.4491 |
| 0.4284 | 1.1403 | 1000 | 0.5724 | -0.1252 | -0.4897 | 0.6800 | 0.3645 | -136.9601 | -131.1501 | -1.3824 | -1.3981 |
| 0.4132 | 1.2544 | 1100 | 0.5563 | -0.1930 | -0.5928 | 0.7600 | 0.3998 | -137.9906 | -131.8278 | -1.3545 | -1.3715 |
| 0.3957 | 1.3684 | 1200 | 0.5543 | -0.2162 | -0.6427 | 0.7600 | 0.4264 | -138.4894 | -132.0604 | -1.3412 | -1.3583 |
| 0.4893 | 1.4824 | 1300 | 0.5476 | -0.2078 | -0.6782 | 0.7200 | 0.4704 | -138.8445 | -131.9757 | -1.3340 | -1.3521 |
| 0.4361 | 1.5965 | 1400 | 0.5413 | -0.2007 | -0.6908 | 0.7200 | 0.4901 | -138.9703 | -131.9046 | -1.3316 | -1.3490 |
| 0.4406 | 1.7105 | 1500 | 0.5477 | -0.2466 | -0.6913 | 0.7200 | 0.4448 | -138.9762 | -132.3638 | -1.3242 | -1.3421 |
| 0.3988 | 1.8245 | 1600 | 0.5449 | -0.2388 | -0.7225 | 0.7200 | 0.4838 | -139.2881 | -132.2855 | -1.3254 | -1.3431 |
| 0.4044 | 1.9386 | 1700 | 0.5377 | -0.2459 | -0.7375 | 0.7200 | 0.4916 | -139.4380 | -132.3574 | -1.3194 | -1.3369 |
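
The Epoch and Step columns allow a rough estimate of the number of optimizer steps per epoch and, combined with total_train_batch_size, of the number of training examples seen per epoch. The card does not state the dataset size, so the figures this sketch prints are estimates limited by the rounding in the Epoch column, not reported values.

```python
# (epoch, step) pairs copied from the first and last rows of the results table.
checkpoints = [(0.1140, 100), (1.9386, 1700)]
total_train_batch_size = 64  # from the hyperparameter list

estimates = []
for epoch, step in checkpoints:
    # Steps per epoch implied by this row: step count divided by fractional epoch.
    steps_per_epoch = step / epoch
    # Each optimizer step consumes one effective batch of examples.
    examples_per_epoch = steps_per_epoch * total_train_batch_size
    estimates.append((steps_per_epoch, examples_per_epoch))
    print(
        f"step {step}: ~{steps_per_epoch:.0f} steps/epoch, "
        f"~{examples_per_epoch:.0f} examples/epoch"
    )
```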

Framework versions

  • Transformers 4.43.4
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.6
  • Tokenizers 0.19.1
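
A minimal loading sketch with Transformers follows. It assumes the repository ID `RyanYr/openchat-3.6-8b-20240522_iter2` is available on the Hugging Face Hub with tokenizer files and loads as a causal language model; the card does not confirm either point. The helper names `load_model` and `generate` are hypothetical, and the plain-text prompt handling is an assumption, since the card does not specify a prompt format.

```python
MODEL_ID = "RyanYr/openchat-3.6-8b-20240522_iter2"  # assumed Hub repository ID


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model in bfloat16; downloads weights on first use."""
    # Imported here so that defining the helper does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for a plain-text prompt (prompt format is an assumption)."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello")` would download roughly 8B parameters in BF16, so it needs substantial disk space and memory.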

Safetensors

  • Model size: 8.03B params
  • Tensor type: BF16