eurus-dpop-qlora-uf-ours-uffull-5e-6
This model is a QLoRA (PEFT) fine-tuned version of openbmb/Eurus-7b-sft, trained on the generation/UF and generation/UFfull datasets. It achieves the following results on the evaluation set:
- Loss: 0.6837
- Positive Losses: 0.3336
- DPO Losses: 0.6519
- Rewards/chosen: 0.1618
- Rewards/rejected: 0.0697
- Rewards/accuracies: 0.6960
- Rewards/margins: 0.0921
- Rewards/margins Max: 0.3623
- Rewards/margins Min: -0.1184
- Rewards/margins Std: 0.1602
- Logps/rejected: -254.2729
- Logps/chosen: -255.4841
- Logits/rejected: -2.1811
- Logits/chosen: -2.2772
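These metrics follow the TRL DPOTrainer logging convention: the implicit reward for a response is beta times the difference between policy and reference log-probabilities, the margin is the chosen reward minus the rejected reward, and accuracy is the fraction of pairs with a positive margin. The "Positive Losses" metric is consistent with DPOP (DPO-Positive), which penalizes the policy when its chosen-response log-probability drops below the reference's; this reading is inferred from the "dpop" model name, not confirmed by the card. A minimal sketch under those assumptions, with an illustrative beta and dummy log-probs:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in per-example sequence log-probs; in training these come from the
# policy and the frozen reference model (values here are purely illustrative).
policy_chosen_logps = torch.randn(8) - 255.0
ref_chosen_logps = torch.randn(8) - 255.0
policy_rejected_logps = torch.randn(8) - 254.0
ref_rejected_logps = torch.randn(8) - 254.0

beta = 0.1  # assumption: a common DPO beta; the actual value is not stated in this card

# Implicit DPO rewards: beta-scaled log-prob ratios against the reference.
rewards_chosen = beta * (policy_chosen_logps - ref_chosen_logps)        # -> Rewards/chosen
rewards_rejected = beta * (policy_rejected_logps - ref_rejected_logps)  # -> Rewards/rejected
margins = rewards_chosen - rewards_rejected                             # -> Rewards/margins

dpo_losses = -F.logsigmoid(margins)                                     # -> DPO Losses
# DPOP-style penalty: positive when the policy's chosen log-prob falls
# below the reference's (assumption based on the "dpop" model name).
positive_losses = torch.clamp(ref_chosen_logps - policy_chosen_logps, min=0.0)

print("rewards/accuracies:", (margins > 0).float().mean().item())
print("rewards/margins   :", margins.mean().item())
print("dpo loss          :", dpo_losses.mean().item())
print("positive loss     :", positive_losses.mean().item())
```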
Model description
More information needed
Intended uses & limitations
More information needed
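Although no usage guidance is documented, this repository contains a PEFT (QLoRA) adapter rather than full model weights, so it is loaded on top of the base model. A minimal sketch using Transformers and PEFT; the dtype, device placement, prompt format, and generation settings are illustrative assumptions:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "openbmb/Eurus-7b-sft"
adapter_id = "just1nseo/eurus-dpop-qlora-uf-ours-uffull-5e-6"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"  # assumption: bf16 inference
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the DPOP QLoRA adapter

prompt = "[INST] Explain QLoRA in one sentence. [/INST]"  # assumption: illustrative prompt format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```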
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
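For readers who want to reproduce the optimizer setup, the list above maps directly onto Hugging Face TrainingArguments. A hypothetical reconstruction follows; the actual training script, including the DPOP-specific loss options, is not part of this card, and output_dir is illustrative:

```python
from transformers import TrainingArguments

# Per-device batch size 4 on 2 GPUs with gradient accumulation 2
# yields the total train batch size of 16 reported above.
training_args = TrainingArguments(
    output_dir="eurus-dpop-qlora-uf-ours-uffull-5e-6",  # illustrative
    learning_rate=5e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```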
Training results
Training Loss | Epoch | Step | Validation Loss | Positive Losses | DPO Losses | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.6927 | 0.02 | 100 | 0.6925 | 0.0046 | 0.6922 | 0.0145 | 0.0127 | 0.5480 | 0.0018 | 0.0204 | -0.0140 | 0.0112 | -259.9735 | -270.2098 | -2.1753 | -2.2851 |
0.6928 | 0.05 | 200 | 0.6926 | 0.0501 | 0.6881 | 0.0395 | 0.0291 | 0.5960 | 0.0104 | 0.0706 | -0.0363 | 0.0351 | -258.3304 | -267.7088 | -2.1615 | -2.2708 |
0.6854 | 0.07 | 300 | 0.6829 | 0.0095 | 0.6820 | 0.1239 | 0.1004 | 0.6485 | 0.0235 | 0.1213 | -0.0594 | 0.0594 | -251.2040 | -259.2754 | -2.1597 | -2.2680 |
0.7146 | 0.1 | 400 | 0.6961 | 0.2328 | 0.6708 | 0.1096 | 0.0619 | 0.6845 | 0.0477 | 0.1987 | -0.0845 | 0.0939 | -255.0527 | -260.7050 | -2.1479 | -2.2540 |
0.7008 | 0.12 | 500 | 0.6895 | 0.2048 | 0.6680 | 0.1200 | 0.0660 | 0.6815 | 0.0540 | 0.2196 | -0.0893 | 0.1035 | -254.6419 | -259.6605 | -2.1377 | -2.2443 |
0.6931 | 0.14 | 600 | 0.6821 | 0.1306 | 0.6672 | 0.1310 | 0.0755 | 0.6920 | 0.0555 | 0.2219 | -0.0883 | 0.1038 | -253.6935 | -258.5599 | -2.1359 | -2.2425 |
0.6684 | 0.17 | 700 | 0.6805 | 0.1322 | 0.6669 | 0.1446 | 0.0879 | 0.6850 | 0.0567 | 0.2378 | -0.0948 | 0.1112 | -252.4529 | -257.2010 | -2.1330 | -2.2403 |
0.6892 | 0.19 | 800 | 0.6876 | 0.2539 | 0.6631 | 0.1390 | 0.0738 | 0.6900 | 0.0653 | 0.2606 | -0.1012 | 0.1216 | -253.8653 | -257.7604 | -2.1681 | -2.2734 |
0.6887 | 0.22 | 900 | 0.6834 | 0.2059 | 0.6632 | 0.1491 | 0.0839 | 0.6825 | 0.0652 | 0.2685 | -0.1056 | 0.1243 | -252.8537 | -256.7542 | -2.1617 | -2.2645 |
0.6785 | 0.24 | 1000 | 0.6885 | 0.2658 | 0.6623 | 0.1477 | 0.0803 | 0.6890 | 0.0674 | 0.2739 | -0.1111 | 0.1279 | -253.2097 | -256.8882 | -2.1750 | -2.2765 |
0.7053 | 0.26 | 1100 | 0.6771 | 0.1115 | 0.6671 | 0.1649 | 0.1075 | 0.6645 | 0.0574 | 0.2616 | -0.1163 | 0.1261 | -250.4924 | -255.1730 | -2.1717 | -2.2734 |
0.6642 | 0.29 | 1200 | 0.7051 | 0.4704 | 0.6577 | 0.1311 | 0.0532 | 0.6960 | 0.0778 | 0.3075 | -0.1132 | 0.1409 | -255.9189 | -258.5546 | -2.2010 | -2.3000 |
0.7044 | 0.31 | 1300 | 0.6949 | 0.3566 | 0.6595 | 0.1433 | 0.0695 | 0.6955 | 0.0737 | 0.2961 | -0.1130 | 0.1359 | -254.2908 | -257.3372 | -2.2117 | -2.3129 |
0.6837 | 0.34 | 1400 | 0.6878 | 0.3032 | 0.6602 | 0.1478 | 0.0755 | 0.7030 | 0.0723 | 0.2937 | -0.1114 | 0.1341 | -253.6898 | -256.8842 | -2.1897 | -2.2887 |
0.7168 | 0.36 | 1500 | 0.6919 | 0.3620 | 0.6580 | 0.1429 | 0.0657 | 0.7010 | 0.0773 | 0.3068 | -0.1198 | 0.1417 | -254.6766 | -257.3681 | -2.1618 | -2.2599 |
0.6956 | 0.38 | 1600 | 0.6867 | 0.3031 | 0.6593 | 0.1451 | 0.0710 | 0.7010 | 0.0741 | 0.2933 | -0.1161 | 0.1363 | -254.1435 | -257.1503 | -2.1631 | -2.2605 |
0.6737 | 0.41 | 1700 | 0.6772 | 0.1621 | 0.6641 | 0.1706 | 0.1066 | 0.6875 | 0.0640 | 0.2794 | -0.1154 | 0.1311 | -250.5799 | -254.6045 | -2.1643 | -2.2628 |
0.6826 | 0.43 | 1800 | 0.6796 | 0.2140 | 0.6617 | 0.1639 | 0.0948 | 0.6980 | 0.0691 | 0.2894 | -0.1164 | 0.1344 | -251.7600 | -255.2692 | -2.1697 | -2.2688 |
0.7015 | 0.45 | 1900 | 0.6877 | 0.3326 | 0.6575 | 0.1498 | 0.0714 | 0.7045 | 0.0783 | 0.3065 | -0.1114 | 0.1391 | -254.1016 | -256.6878 | -2.1641 | -2.2624 |
0.6828 | 0.48 | 2000 | 0.6894 | 0.3704 | 0.6549 | 0.1504 | 0.0661 | 0.7035 | 0.0844 | 0.3245 | -0.1181 | 0.1476 | -254.6347 | -256.6188 | -2.1769 | -2.2757 |
0.6658 | 0.5 | 2100 | 0.6846 | 0.3040 | 0.6561 | 0.1566 | 0.0749 | 0.7000 | 0.0818 | 0.3202 | -0.1168 | 0.1455 | -253.7529 | -255.9983 | -2.1652 | -2.2649 |
0.6767 | 0.53 | 2200 | 0.6849 | 0.3118 | 0.6572 | 0.1553 | 0.0760 | 0.6965 | 0.0794 | 0.3173 | -0.1198 | 0.1465 | -253.6463 | -256.1299 | -2.1629 | -2.2619 |
0.696 | 0.55 | 2300 | 0.6847 | 0.3145 | 0.6563 | 0.1533 | 0.0716 | 0.6985 | 0.0817 | 0.3263 | -0.1227 | 0.1504 | -254.0864 | -256.3344 | -2.1556 | -2.2545 |
0.6564 | 0.57 | 2400 | 0.6804 | 0.2738 | 0.6559 | 0.1592 | 0.0770 | 0.6950 | 0.0822 | 0.3235 | -0.1154 | 0.1471 | -253.5391 | -255.7443 | -2.1666 | -2.2650 |
0.6717 | 0.6 | 2500 | 0.6916 | 0.4126 | 0.6519 | 0.1507 | 0.0589 | 0.7025 | 0.0918 | 0.3565 | -0.1218 | 0.1596 | -255.3540 | -256.5908 | -2.1731 | -2.2721 |
0.7191 | 0.62 | 2600 | 0.6875 | 0.3646 | 0.6521 | 0.1553 | 0.0637 | 0.6995 | 0.0916 | 0.3583 | -0.1217 | 0.1599 | -254.8743 | -256.1363 | -2.1882 | -2.2857 |
0.6916 | 0.65 | 2700 | 0.6833 | 0.3114 | 0.6534 | 0.1588 | 0.0703 | 0.6905 | 0.0885 | 0.3483 | -0.1184 | 0.1559 | -254.2137 | -255.7853 | -2.1821 | -2.2801 |
0.7428 | 0.67 | 2800 | 0.6791 | 0.2390 | 0.6559 | 0.1679 | 0.0853 | 0.6905 | 0.0826 | 0.3337 | -0.1153 | 0.1493 | -252.7114 | -254.8721 | -2.1918 | -2.2884 |
0.6581 | 0.69 | 2900 | 0.6784 | 0.2405 | 0.6550 | 0.1690 | 0.0841 | 0.6955 | 0.0848 | 0.3421 | -0.1187 | 0.1536 | -252.8289 | -254.7648 | -2.1957 | -2.2932 |
0.6691 | 0.72 | 3000 | 0.6819 | 0.2928 | 0.6533 | 0.1651 | 0.0763 | 0.6930 | 0.0888 | 0.3514 | -0.1185 | 0.1566 | -253.6139 | -255.1557 | -2.1800 | -2.2776 |
0.6585 | 0.74 | 3100 | 0.6880 | 0.3758 | 0.6514 | 0.1589 | 0.0655 | 0.6980 | 0.0934 | 0.3651 | -0.1195 | 0.1618 | -254.6915 | -255.7693 | -2.1771 | -2.2747 |
0.6689 | 0.77 | 3200 | 0.6863 | 0.3558 | 0.6515 | 0.1606 | 0.0676 | 0.6980 | 0.0930 | 0.3645 | -0.1198 | 0.1617 | -254.4865 | -255.6035 | -2.1869 | -2.2837 |
0.6608 | 0.79 | 3300 | 0.6889 | 0.3924 | 0.6507 | 0.1575 | 0.0626 | 0.6965 | 0.0948 | 0.3683 | -0.1195 | 0.1630 | -254.9793 | -255.9175 | -2.1791 | -2.2756 |
0.7082 | 0.81 | 3400 | 0.6822 | 0.3032 | 0.6528 | 0.1632 | 0.0733 | 0.6910 | 0.0899 | 0.3556 | -0.1170 | 0.1576 | -253.9138 | -255.3429 | -2.1835 | -2.2794 |
0.647 | 0.84 | 3500 | 0.6828 | 0.3155 | 0.6524 | 0.1628 | 0.0719 | 0.6965 | 0.0910 | 0.3582 | -0.1174 | 0.1586 | -254.0551 | -255.3804 | -2.1825 | -2.2785 |
0.7095 | 0.86 | 3600 | 0.6850 | 0.3475 | 0.6516 | 0.1604 | 0.0677 | 0.6975 | 0.0927 | 0.3628 | -0.1186 | 0.1605 | -254.4760 | -255.6236 | -2.1845 | -2.2802 |
0.6504 | 0.89 | 3700 | 0.6835 | 0.3264 | 0.6521 | 0.1622 | 0.0707 | 0.6965 | 0.0916 | 0.3609 | -0.1177 | 0.1595 | -254.1773 | -255.4386 | -2.1823 | -2.2783 |
0.6851 | 0.91 | 3800 | 0.6839 | 0.3360 | 0.6517 | 0.1613 | 0.0688 | 0.6955 | 0.0925 | 0.3629 | -0.1186 | 0.1605 | -254.3600 | -255.5343 | -2.1854 | -2.2812 |
0.6898 | 0.93 | 3900 | 0.6840 | 0.3369 | 0.6518 | 0.1614 | 0.0690 | 0.6950 | 0.0924 | 0.3633 | -0.1186 | 0.1606 | -254.3441 | -255.5256 | -2.1876 | -2.2832 |
0.6931 | 0.96 | 4000 | 0.6838 | 0.3337 | 0.6519 | 0.1618 | 0.0697 | 0.6960 | 0.0921 | 0.3623 | -0.1186 | 0.1603 | -254.2723 | -255.4866 | -2.1902 | -2.2856 |
0.6552 | 0.98 | 4100 | 0.6837 | 0.3337 | 0.6519 | 0.1619 | 0.0696 | 0.6960 | 0.0922 | 0.3631 | -0.1185 | 0.1604 | -254.2784 | -255.4766 | -2.1871 | -2.2827 |
Framework versions
- PEFT 0.7.1
- Transformers 4.39.0.dev0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2