eurus-dpop-qlora-uf-ours-uffull-5e-7

This model is a QLoRA (PEFT) adapter fine-tuned from openbmb/Eurus-7b-sft on the generation/UF and generation/UFfull datasets. It achieves the following results on the evaluation set (a hedged loading sketch follows the metric list):

  • Loss: 0.6808
  • Positive Losses: 0.1036
  • Dpo Losses: 0.6696
  • Rewards/chosen: 0.1230
  • Rewards/rejected: 0.0729
  • Rewards/accuracies: 0.6975
  • Rewards/margins: 0.0500
  • Rewards/margins Max: 0.2007
  • Rewards/margins Min: -0.0789
  • Rewards/margins Std: 0.0927
  • Logps/rejected: -253.9477
  • Logps/chosen: -259.3652
  • Logits/rejected: -2.1481
  • Logits/chosen: -2.2548
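
The "Dpo Losses" and "Positive Losses" metrics correspond to the two pieces of a DPO-Positive (DPOP) objective, which the model name suggests was used. The card does not state the exact formulation; for reference, the published DPOP loss (Pal et al., 2024) augments the DPO sigmoid argument with a penalty that keeps the policy's log-likelihood of the chosen response from dropping below the reference model's:

$$
\mathcal{L}_{\mathrm{DPOP}} = -\log \sigma\Big(\beta\Big[\log\frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \log\frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} - \lambda \max\Big(0,\ \log\frac{\pi_{\mathrm{ref}}(y_w \mid x)}{\pi_\theta(y_w \mid x)}\Big)\Big]\Big)
$$

"Positive Losses" plausibly tracks the max-penalty term and "Dpo Losses" the standard DPO term; treat this mapping as an assumption, since the training code is not linked here.

Because this repository contains a PEFT adapter rather than full model weights, it must be loaded on top of the base model. A minimal sketch, assuming the repo ids from this card and 4-bit quantization to mirror the QLoRA setup (the generation settings are illustrative):

```python
# Hedged sketch: load the base model in 4-bit and attach this LoRA adapter.
# Repo ids come from this card; the BitsAndBytes settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "openbmb/Eurus-7b-sft"
adapter_id = "just1nseo/eurus-dpop-qlora-uf-ours-uffull-5e-7"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the DPOP adapter

prompt = "Explain the difference between DPO and supervised fine-tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```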

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 5e-07
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
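
As a reference, here is a hedged sketch of how the hyperparameters above map onto transformers.TrainingArguments. The actual run used a DPO(P)-style trainer launched across 2 GPUs; the output directory and bf16 precision below are assumptions, and everything else is copied from the list:

```python
# Hedged sketch only: mirrors the listed hyperparameters in TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="eurus-dpop-qlora-uf-ours-uffull-5e-7",  # hypothetical output dir
    learning_rate=5e-7,
    per_device_train_batch_size=4,   # x 2 GPUs x 2 accumulation steps = 16 total
    per_device_eval_batch_size=8,    # x 2 GPUs = 16 total
    gradient_accumulation_steps=2,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,                  # Adam betas and epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,                       # assumed; precision is not stated on the card
)
```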

Training results

| Training Loss | Epoch | Step | Validation Loss | Positive Losses | Dpo Losses | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6966 | 0.02 | 100 | 0.6947 | 0.0174 | 0.6931 | 0.0034 | 0.0033 | 0.4815 | 0.0001 | 0.0072 | -0.0065 | 0.0045 | -260.9084 | -271.3212 | -2.1669 | -2.2779 |
| 0.6937 | 0.05 | 200 | 0.6935 | 0.0077 | 0.6928 | 0.0060 | 0.0052 | 0.5330 | 0.0007 | 0.0097 | -0.0072 | 0.0055 | -260.7179 | -271.0638 | -2.1818 | -2.2914 |
| 0.6938 | 0.07 | 300 | 0.6927 | 0.0055 | 0.6923 | 0.0089 | 0.0073 | 0.5690 | 0.0017 | 0.0142 | -0.0089 | 0.0076 | -260.5158 | -270.7685 | -2.1744 | -2.2845 |
| 0.6923 | 0.1 | 400 | 0.6924 | 0.0093 | 0.6915 | 0.0145 | 0.0112 | 0.5825 | 0.0034 | 0.0247 | -0.0138 | 0.0127 | -260.1245 | -270.2083 | -2.1746 | -2.2844 |
| 0.6912 | 0.12 | 500 | 0.6915 | 0.0110 | 0.6904 | 0.0206 | 0.0150 | 0.6125 | 0.0056 | 0.0353 | -0.0182 | 0.0176 | -259.7408 | -269.6042 | -2.1638 | -2.2741 |
| 0.6925 | 0.14 | 600 | 0.6901 | 0.0059 | 0.6896 | 0.0296 | 0.0222 | 0.6220 | 0.0073 | 0.0451 | -0.0227 | 0.0222 | -259.0201 | -268.7073 | -2.1660 | -2.2762 |
| 0.6881 | 0.17 | 700 | 0.6894 | 0.0133 | 0.6883 | 0.0360 | 0.0261 | 0.6240 | 0.0100 | 0.0579 | -0.0279 | 0.0280 | -258.6333 | -268.0581 | -2.1708 | -2.2802 |
| 0.6855 | 0.19 | 800 | 0.6900 | 0.0361 | 0.6866 | 0.0429 | 0.0294 | 0.6420 | 0.0135 | 0.0737 | -0.0336 | 0.0351 | -258.3028 | -267.3751 | -2.1658 | -2.2752 |
| 0.6812 | 0.22 | 900 | 0.6879 | 0.0427 | 0.6840 | 0.0768 | 0.0578 | 0.6425 | 0.0190 | 0.1031 | -0.0471 | 0.0492 | -255.4605 | -263.9813 | -2.1635 | -2.2725 |
| 0.6825 | 0.24 | 1000 | 0.6868 | 0.0569 | 0.6814 | 0.1024 | 0.0776 | 0.6360 | 0.0248 | 0.1315 | -0.0608 | 0.0632 | -253.4859 | -261.4236 | -2.1594 | -2.2679 |
| 0.6833 | 0.26 | 1100 | 0.6861 | 0.0622 | 0.6797 | 0.1057 | 0.0773 | 0.6435 | 0.0284 | 0.1421 | -0.0645 | 0.0678 | -253.5126 | -261.0904 | -2.1621 | -2.2707 |
| 0.6798 | 0.29 | 1200 | 0.6854 | 0.0679 | 0.6781 | 0.1059 | 0.0741 | 0.6605 | 0.0318 | 0.1506 | -0.0677 | 0.0716 | -253.8279 | -261.0721 | -2.1610 | -2.2695 |
| 0.6803 | 0.31 | 1300 | 0.6875 | 0.1069 | 0.6764 | 0.1044 | 0.0690 | 0.6610 | 0.0355 | 0.1630 | -0.0727 | 0.0775 | -254.3442 | -261.2181 | -2.1593 | -2.2681 |
| 0.685 | 0.34 | 1400 | 0.6828 | 0.0586 | 0.6764 | 0.1116 | 0.0762 | 0.6680 | 0.0354 | 0.1576 | -0.0685 | 0.0742 | -253.6211 | -260.5063 | -2.1571 | -2.2658 |
| 0.6823 | 0.36 | 1500 | 0.6872 | 0.1236 | 0.6745 | 0.1060 | 0.0664 | 0.6770 | 0.0396 | 0.1744 | -0.0754 | 0.0822 | -254.6037 | -261.0612 | -2.1502 | -2.2587 |
| 0.6854 | 0.38 | 1600 | 0.6883 | 0.1427 | 0.6738 | 0.1062 | 0.0650 | 0.6825 | 0.0411 | 0.1816 | -0.0782 | 0.0855 | -254.7401 | -261.0461 | -2.1471 | -2.2550 |
| 0.6703 | 0.41 | 1700 | 0.6838 | 0.0966 | 0.6738 | 0.1122 | 0.0712 | 0.6795 | 0.0411 | 0.1768 | -0.0749 | 0.0829 | -254.1271 | -260.4419 | -2.1541 | -2.2614 |
| 0.6789 | 0.43 | 1800 | 0.6796 | 0.0446 | 0.6744 | 0.1207 | 0.0812 | 0.6800 | 0.0395 | 0.1646 | -0.0676 | 0.0767 | -253.1234 | -259.5977 | -2.1537 | -2.2616 |
| 0.6885 | 0.45 | 1900 | 0.6827 | 0.0926 | 0.6726 | 0.1144 | 0.0708 | 0.6850 | 0.0436 | 0.1800 | -0.0742 | 0.0842 | -254.1586 | -260.2216 | -2.1466 | -2.2544 |
| 0.6735 | 0.48 | 2000 | 0.6811 | 0.0765 | 0.6727 | 0.1187 | 0.0755 | 0.6905 | 0.0432 | 0.1781 | -0.0729 | 0.0830 | -253.6908 | -259.7894 | -2.1524 | -2.2599 |
| 0.6761 | 0.5 | 2100 | 0.6803 | 0.0718 | 0.6726 | 0.1209 | 0.0774 | 0.6915 | 0.0435 | 0.1794 | -0.0728 | 0.0834 | -253.5071 | -259.5759 | -2.1470 | -2.2549 |
| 0.6727 | 0.53 | 2200 | 0.6847 | 0.1341 | 0.6711 | 0.1145 | 0.0676 | 0.6890 | 0.0469 | 0.1952 | -0.0805 | 0.0911 | -254.4793 | -260.2140 | -2.1423 | -2.2500 |
| 0.6873 | 0.55 | 2300 | 0.6821 | 0.1057 | 0.6714 | 0.1183 | 0.0720 | 0.6860 | 0.0463 | 0.1919 | -0.0781 | 0.0892 | -254.0430 | -259.8377 | -2.1472 | -2.2542 |
| 0.6739 | 0.57 | 2400 | 0.6798 | 0.0767 | 0.6719 | 0.1235 | 0.0785 | 0.6875 | 0.0451 | 0.1874 | -0.0755 | 0.0868 | -253.3946 | -259.3081 | -2.1432 | -2.2505 |
| 0.6737 | 0.6 | 2500 | 0.6827 | 0.1178 | 0.6705 | 0.1181 | 0.0699 | 0.6890 | 0.0482 | 0.1986 | -0.0798 | 0.0920 | -254.2483 | -259.8498 | -2.1419 | -2.2492 |
| 0.6833 | 0.62 | 2600 | 0.6828 | 0.1217 | 0.6701 | 0.1183 | 0.0694 | 0.6925 | 0.0490 | 0.1995 | -0.0800 | 0.0925 | -254.3057 | -259.8303 | -2.1445 | -2.2515 |
| 0.6935 | 0.65 | 2700 | 0.6832 | 0.1272 | 0.6698 | 0.1180 | 0.0685 | 0.6955 | 0.0496 | 0.2013 | -0.0807 | 0.0933 | -254.3967 | -259.8595 | -2.1511 | -2.2579 |
| 0.7108 | 0.67 | 2800 | 0.6803 | 0.0903 | 0.6707 | 0.1235 | 0.0759 | 0.6965 | 0.0477 | 0.1937 | -0.0772 | 0.0897 | -253.6575 | -259.3115 | -2.1538 | -2.2603 |
| 0.6698 | 0.69 | 2900 | 0.6801 | 0.0884 | 0.6704 | 0.1237 | 0.0756 | 0.6945 | 0.0482 | 0.1930 | -0.0763 | 0.0893 | -253.6860 | -259.2906 | -2.1413 | -2.2489 |
| 0.679 | 0.72 | 3000 | 0.6807 | 0.0975 | 0.6701 | 0.1226 | 0.0737 | 0.6950 | 0.0489 | 0.1957 | -0.0774 | 0.0907 | -253.8698 | -259.4003 | -2.1425 | -2.2501 |
| 0.6674 | 0.74 | 3100 | 0.6814 | 0.1078 | 0.6696 | 0.1212 | 0.0712 | 0.6990 | 0.0500 | 0.1993 | -0.0785 | 0.0922 | -254.1188 | -259.5419 | -2.1483 | -2.2554 |
| 0.6735 | 0.77 | 3200 | 0.6809 | 0.1022 | 0.6698 | 0.1223 | 0.0726 | 0.6990 | 0.0497 | 0.1980 | -0.0779 | 0.0916 | -253.9808 | -259.4351 | -2.1468 | -2.2539 |
| 0.6741 | 0.79 | 3300 | 0.6817 | 0.1139 | 0.6694 | 0.1209 | 0.0705 | 0.6935 | 0.0504 | 0.2015 | -0.0793 | 0.0932 | -254.1916 | -259.5721 | -2.1429 | -2.2503 |
| 0.6884 | 0.81 | 3400 | 0.6813 | 0.1092 | 0.6696 | 0.1218 | 0.0718 | 0.6930 | 0.0500 | 0.2008 | -0.0788 | 0.0928 | -254.0630 | -259.4802 | -2.1390 | -2.2466 |
| 0.6708 | 0.84 | 3500 | 0.6810 | 0.1052 | 0.6696 | 0.1224 | 0.0723 | 0.6910 | 0.0500 | 0.1998 | -0.0784 | 0.0924 | -254.0100 | -259.4273 | -2.1448 | -2.2519 |
| 0.6812 | 0.86 | 3600 | 0.6810 | 0.1051 | 0.6696 | 0.1223 | 0.0723 | 0.6930 | 0.0500 | 0.2002 | -0.0787 | 0.0925 | -254.0087 | -259.4301 | -2.1462 | -2.2532 |
| 0.6689 | 0.89 | 3700 | 0.6808 | 0.1035 | 0.6697 | 0.1228 | 0.0730 | 0.6965 | 0.0499 | 0.1999 | -0.0788 | 0.0924 | -253.9457 | -259.3808 | -2.1432 | -2.2504 |
| 0.6725 | 0.91 | 3800 | 0.6808 | 0.1038 | 0.6696 | 0.1229 | 0.0728 | 0.6980 | 0.0500 | 0.2004 | -0.0786 | 0.0926 | -253.9600 | -259.3763 | -2.1513 | -2.2579 |
| 0.6712 | 0.93 | 3900 | 0.6808 | 0.1036 | 0.6696 | 0.1229 | 0.0728 | 0.6930 | 0.0501 | 0.2003 | -0.0787 | 0.0926 | -253.9589 | -259.3714 | -2.1501 | -2.2567 |
| 0.6882 | 0.96 | 4000 | 0.6808 | 0.1041 | 0.6696 | 0.1230 | 0.0730 | 0.6940 | 0.0500 | 0.2002 | -0.0786 | 0.0926 | -253.9395 | -259.3630 | -2.1476 | -2.2545 |
| 0.6692 | 0.98 | 4100 | 0.6808 | 0.1020 | 0.6696 | 0.1230 | 0.0730 | 0.6950 | 0.0501 | 0.2003 | -0.0787 | 0.0926 | -253.9474 | -259.3606 | -2.1399 | -2.2473 |

Framework versions

  • PEFT 0.7.1
  • Transformers 4.39.0.dev0
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.6
  • Tokenizers 0.15.2