zephyr-7b-dpo-qlora-v1

This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-qlora on the HuggingFaceH4/ultrafeedback_binarized dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4853
  • Rewards/chosen: -1.9997
  • Rewards/rejected: -3.0850
  • Rewards/accuracies: 0.6725
  • Rewards/margins: 1.0854
  • Logps/rejected: -520.1135
  • Logps/chosen: -431.9709
  • Logits/rejected: -0.9261
  • Logits/chosen: -1.0556
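
The reward figures above can be cross-checked with a little arithmetic. The sketch below recomputes the margin as chosen minus rejected and, assuming the standard sigmoid DPO loss (the card does not state which loss variant was used), evaluates that loss at a given margin. The helper name `sigmoid_dpo_loss` is illustrative and not taken from the training code.

```python
import math

# Final evaluation metrics copied from the list above.
rewards_chosen = -1.9997
rewards_rejected = -3.0850
reported_margin = 1.0854

# The margin is the gap between the chosen and rejected rewards.
margin = rewards_chosen - rewards_rejected
print(f"recomputed margin: {margin:.4f}")  # 1.0853, within rounding of 1.0854


def sigmoid_dpo_loss(reward_margin: float) -> float:
    """Assumed loss form: -log(sigmoid(margin)), written as softplus(-margin)."""
    return math.log1p(math.exp(-reward_margin))


# Sanity check against the first evaluation row of the training results
# (margin 0.0009, validation loss 0.6927).
print(f"loss at margin 0.0009: {sigmoid_dpo_loss(0.0009):.4f}")  # 0.6927

# Loss evaluated at the mean margin. The reported loss (0.4853) is a mean of
# per-example losses, so it is expected to be higher than this value.
print(f"loss at mean margin: {sigmoid_dpo_loss(reported_margin):.4f}")
```

Under that assumption, the loss at the mean margin is a lower bound on the mean loss because the loss is convex in the margin. The gap between the two therefore reflects how widely per-example margins are spread, not an inconsistency in the reported numbers.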

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
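
To make the scheduler settings concrete, here is a minimal sketch of a cosine schedule with linear warmup using the listed learning_rate (5e-06) and lr_scheduler_warmup_ratio (0.1). The function name and `TOTAL_STEPS` are assumptions for illustration: the card does not state the total step count, and 15,300 is only a rough estimate read off the results table (step 15200 at epoch 0.99).

```python
import math

PEAK_LR = 5e-6        # learning_rate from the list above
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 15_300  # assumption: rough estimate, not stated in the card


def cosine_lr_with_warmup(step: int, total_steps: int = TOTAL_STEPS,
                          peak_lr: float = PEAK_LR,
                          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr, then a half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 500, 2_000, 8_000, 15_000):
    print(step, f"{cosine_lr_with_warmup(step):.3e}")
```

With these settings the learning rate climbs for roughly the first tenth of training and then decays smoothly toward zero. That is consistent with the validation metrics in the results table changing very little over the last few thousand steps.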

Training results

Training Loss Epoch Step Validation Loss Rewards/chosen Rewards/rejected Rewards/accuracies Rewards/margins Logps/rejected Logps/chosen Logits/rejected Logits/chosen
0.6933 0.01 100 0.6927 0.0023 0.0014 0.4950 0.0009 -211.4760 -231.7798 -2.1609 -2.3494
0.691 0.01 200 0.6900 0.0094 0.0031 0.5825 0.0063 -211.3033 -231.0670 -2.1586 -2.3468
0.6796 0.02 300 0.6832 0.0364 0.0156 0.5785 0.0208 -210.0561 -228.3676 -2.1598 -2.3479
0.6558 0.03 400 0.6709 0.0348 -0.0139 0.6030 0.0487 -213.0039 -228.5253 -2.1556 -2.3431
0.6509 0.03 500 0.6525 -0.0685 -0.1665 0.6060 0.0980 -228.2622 -238.8526 -2.1523 -2.3397
0.6521 0.04 600 0.6306 -0.1447 -0.3161 0.6010 0.1714 -243.2220 -246.4779 -2.2043 -2.3956
0.6828 0.05 700 0.6355 -0.4797 -0.6338 0.5995 0.1541 -274.9947 -279.9760 -2.2205 -2.4135
0.6578 0.05 800 0.6070 -0.4183 -0.6993 0.6050 0.2810 -281.5427 -273.8341 -2.2567 -2.4512
0.6272 0.06 900 0.6149 -0.2798 -0.5197 0.6060 0.2398 -263.5772 -259.9874 -2.1332 -2.3184
0.6772 0.07 1000 0.5979 -0.5950 -0.8996 0.6125 0.3045 -301.5699 -291.5083 -2.0915 -2.2731
0.629 0.07 1100 0.5842 -1.1663 -1.5846 0.6255 0.4183 -370.0742 -348.6391 -1.8959 -2.0642
0.6763 0.08 1200 0.5800 -1.2262 -1.6772 0.625 0.4510 -379.3279 -354.6231 -1.7782 -1.9453
0.6468 0.09 1300 0.5959 -1.4323 -1.7335 0.6265 0.3012 -384.9624 -375.2356 -1.8355 -2.0034
0.5302 0.09 1400 0.5790 -1.0222 -1.4230 0.6370 0.4008 -353.9126 -334.2268 -1.8706 -2.0396
0.5512 0.1 1500 0.5627 -0.8389 -1.3789 0.6370 0.5400 -349.4973 -315.8946 -1.6729 -1.8295
0.6386 0.1 1600 0.5758 -0.8213 -1.2877 0.6245 0.4664 -340.3790 -314.1301 -1.5010 -1.6500
0.5515 0.11 1700 0.5789 -0.6172 -1.0478 0.6155 0.4306 -316.3881 -293.7214 -1.4651 -1.6102
0.5693 0.12 1800 0.5637 -0.9140 -1.3485 0.6435 0.4346 -346.4651 -323.4023 -1.5711 -1.7296
0.4312 0.12 1900 0.5713 -1.6389 -2.2013 0.6300 0.5624 -431.7438 -395.8936 -1.3446 -1.4912
0.6104 0.13 2000 0.5692 -2.5833 -3.1248 0.6295 0.5416 -524.0952 -490.3331 -1.1864 -1.3215
0.589 0.14 2100 0.5548 -1.2062 -1.8842 0.6355 0.6780 -400.0314 -352.6257 -1.4682 -1.6258
0.632 0.14 2200 0.5550 -1.7218 -2.4957 0.6340 0.7739 -461.1841 -404.1832 -0.9609 -1.0862
0.5211 0.15 2300 0.5417 -0.9631 -1.6396 0.6375 0.6765 -375.5683 -328.3126 -1.2698 -1.4156
0.4854 0.16 2400 0.5439 -1.4291 -2.0590 0.6405 0.6299 -417.5105 -374.9135 -1.1047 -1.2360
0.4768 0.16 2500 0.5402 -2.0118 -2.7496 0.6360 0.7377 -486.5682 -433.1884 -0.8693 -0.9927
0.562 0.17 2600 0.5278 -2.0156 -2.7483 0.6605 0.7326 -486.4391 -433.5695 -0.8911 -1.0129
0.4748 0.18 2700 0.5315 -1.4482 -2.1044 0.6515 0.6562 -422.0545 -376.8264 -1.1406 -1.2759
0.5099 0.18 2800 0.5306 -1.6029 -2.2872 0.6550 0.6843 -440.3303 -392.2982 -0.9484 -1.0749
0.4184 0.19 2900 0.5267 -1.6154 -2.4104 0.6515 0.7949 -452.6504 -393.5496 -0.7930 -0.9077
0.468 0.2 3000 0.5223 -1.7343 -2.5635 0.6555 0.8291 -467.9596 -405.4379 -0.8916 -1.0169
0.5857 0.2 3100 0.5290 -1.2637 -1.9922 0.6520 0.7284 -410.8308 -358.3795 -1.1037 -1.2386
0.4504 0.21 3200 0.5196 -2.6280 -3.5656 0.6515 0.9376 -568.1714 -494.8058 -0.9832 -1.1167
0.5336 0.22 3300 0.5212 -1.3201 -2.1095 0.6515 0.7894 -422.5596 -364.0115 -1.0917 -1.2265
0.5781 0.22 3400 0.5176 -1.7501 -2.6224 0.6575 0.8723 -473.8530 -407.0196 -0.9397 -1.0673
0.4228 0.23 3500 0.5153 -1.7241 -2.5518 0.6590 0.8277 -466.7913 -404.4118 -1.0211 -1.1501
0.5345 0.24 3600 0.5146 -1.9883 -2.7936 0.6580 0.8054 -490.9767 -430.8306 -0.7439 -0.8562
0.6089 0.24 3700 0.5182 -2.4209 -3.3002 0.6505 0.8794 -541.6331 -474.0902 -1.0100 -1.1421
0.4123 0.25 3800 0.5434 -3.5880 -4.2465 0.6360 0.6585 -636.2662 -590.8090 -0.5056 -0.6039
0.6359 0.26 3900 0.5269 -2.6651 -3.5331 0.6410 0.8680 -564.9203 -498.5152 -0.6802 -0.7944
0.5634 0.26 4000 0.5224 -2.3672 -3.1722 0.6515 0.8050 -528.8313 -468.7206 -0.9063 -1.0345
0.7537 0.27 4100 0.5229 -1.2274 -2.0411 0.6525 0.8138 -415.7260 -354.7430 -1.3053 -1.4554
0.5164 0.27 4200 0.5161 -2.2621 -3.1010 0.6490 0.8389 -521.7140 -458.2183 -0.9361 -1.0663
0.6486 0.28 4300 0.5247 -0.7764 -1.5282 0.6550 0.7518 -364.4350 -309.6467 -1.3301 -1.4797
0.4663 0.29 4400 0.5215 -1.6682 -2.6407 0.6525 0.9725 -475.6791 -398.8208 -0.9512 -1.0872
0.5322 0.29 4500 0.5166 -2.3459 -3.2929 0.6485 0.9470 -540.9030 -466.5963 -0.9451 -1.0830
0.5485 0.3 4600 0.5371 -1.2907 -1.8740 0.6510 0.5833 -399.0143 -361.0744 -1.2451 -1.3869
0.4012 0.31 4700 0.5190 -2.6301 -3.6818 0.6515 1.0518 -579.7961 -495.0129 -0.8302 -0.9635
0.4963 0.31 4800 0.5126 -1.9284 -3.0117 0.6540 1.0832 -512.7780 -424.8492 -1.0117 -1.1538
0.5004 0.32 4900 0.5151 -2.9464 -3.7231 0.6615 0.7767 -583.9199 -526.6473 -0.7704 -0.8908
0.465 0.33 5000 0.5096 -2.3399 -3.2128 0.6675 0.8729 -532.8920 -465.9922 -0.9343 -1.0639
0.4609 0.33 5100 0.5073 -1.9864 -2.8868 0.6655 0.9004 -500.2922 -430.6409 -0.9175 -1.0513
0.4666 0.34 5200 0.5154 -1.5968 -2.3504 0.6600 0.7536 -446.6525 -391.6843 -1.0364 -1.1704
0.6107 0.35 5300 0.5146 -2.2432 -3.1008 0.6570 0.8577 -521.6948 -456.3209 -0.8068 -0.9357
0.5853 0.35 5400 0.5090 -1.6956 -2.5963 0.6625 0.9008 -471.2449 -401.5629 -0.9616 -1.0984
0.5086 0.36 5500 0.5214 -1.7374 -2.4619 0.6595 0.7245 -457.7994 -405.7403 -0.9733 -1.1007
0.4764 0.37 5600 0.5124 -1.6197 -2.4123 0.6625 0.7927 -452.8468 -393.9726 -0.9317 -1.0609
0.6562 0.37 5700 0.5097 -1.3717 -2.1420 0.6710 0.7703 -425.8073 -369.1749 -1.0711 -1.2060
0.5178 0.38 5800 0.5039 -1.3554 -2.3601 0.6615 1.0047 -447.6251 -367.5433 -1.1354 -1.2822
0.5391 0.39 5900 0.5039 -1.3774 -2.2739 0.6615 0.8965 -439.0063 -369.7460 -1.1068 -1.2484
0.4757 0.39 6000 0.5028 -1.5428 -2.4713 0.6655 0.9286 -458.7466 -386.2829 -0.9611 -1.0946
0.5633 0.4 6100 0.5061 -1.4468 -2.3254 0.6605 0.8786 -444.1477 -376.6841 -0.8871 -1.0140
0.4512 0.41 6200 0.5027 -1.1960 -2.0747 0.6590 0.8787 -419.0789 -351.6017 -0.9586 -1.0898
0.4765 0.41 6300 0.5008 -2.1828 -3.1237 0.6655 0.9408 -523.9770 -450.2899 -0.7242 -0.8425
0.5056 0.42 6400 0.5051 -1.7258 -2.6125 0.6590 0.8868 -472.8661 -404.5825 -0.9811 -1.1095
0.5037 0.43 6500 0.5053 -2.3741 -3.2980 0.6645 0.9240 -541.4145 -469.4124 -0.9467 -1.0773
0.5839 0.43 6600 0.5009 -1.4314 -2.3462 0.6710 0.9149 -446.2347 -375.1405 -1.2409 -1.3891
0.6173 0.44 6700 0.5004 -1.8395 -2.7068 0.6695 0.8673 -482.2916 -415.9502 -1.2478 -1.3958
0.4917 0.44 6800 0.4987 -1.8070 -2.6650 0.6670 0.8580 -478.1150 -412.7094 -1.1952 -1.3386
0.4834 0.45 6900 0.4964 -2.4167 -3.3898 0.6680 0.9731 -550.5955 -473.6739 -0.8230 -0.9490
0.4668 0.46 7000 0.5033 -1.6735 -2.5449 0.6700 0.8714 -466.1047 -399.3541 -1.1272 -1.2659
0.4544 0.46 7100 0.4963 -1.5912 -2.5910 0.6715 0.9997 -470.7080 -391.1266 -0.9393 -1.0685
0.5048 0.47 7200 0.5001 -1.6418 -2.4761 0.6675 0.8344 -459.2229 -396.1804 -0.9988 -1.1263
0.5141 0.48 7300 0.4977 -2.0855 -3.2272 0.6680 1.1416 -534.3281 -440.5570 -0.8169 -0.9431
0.646 0.48 7400 0.4976 -1.9253 -2.8543 0.6680 0.9290 -497.0415 -424.5315 -0.9287 -1.0571
0.3417 0.49 7500 0.4937 -1.7911 -2.8197 0.6715 1.0286 -493.5840 -411.1139 -1.0098 -1.1436
0.4662 0.5 7600 0.5001 -1.5015 -2.5022 0.6670 1.0007 -461.8301 -382.1551 -1.1592 -1.2992
0.5059 0.5 7700 0.4979 -1.4138 -2.3752 0.6710 0.9614 -449.1288 -373.3851 -1.1849 -1.3246
0.4464 0.51 7800 0.5017 -2.2094 -3.1960 0.6740 0.9866 -531.2133 -452.9458 -0.9725 -1.0978
0.3597 0.52 7900 0.4956 -1.7191 -2.8268 0.6725 1.1077 -494.2937 -403.9176 -0.9468 -1.0762
0.6685 0.52 8000 0.4940 -2.1435 -3.1275 0.6695 0.9839 -524.3576 -446.3575 -0.7171 -0.8314
0.5494 0.53 8100 0.4914 -2.1363 -3.2125 0.6655 1.0762 -532.8622 -445.6346 -0.8910 -1.0210
0.4703 0.54 8200 0.4949 -2.0165 -2.9677 0.6660 0.9512 -508.3776 -433.6510 -1.0550 -1.1886
0.4901 0.54 8300 0.4976 -1.8477 -2.7569 0.6635 0.9092 -487.3053 -416.7779 -1.0724 -1.2041
0.4759 0.55 8400 0.4949 -2.4730 -3.5475 0.6655 1.0744 -566.3603 -479.3096 -0.8860 -1.0123
0.5511 0.56 8500 0.4967 -2.6613 -3.8456 0.6690 1.1843 -596.1694 -498.1316 -0.8653 -0.9928
0.4126 0.56 8600 0.4945 -1.8268 -2.8529 0.6665 1.0261 -496.9024 -414.6831 -1.1029 -1.2387
0.4881 0.57 8700 0.4980 -1.5900 -2.6377 0.6620 1.0477 -475.3844 -391.0065 -1.0996 -1.2381
0.4813 0.58 8800 0.4959 -1.8619 -2.9832 0.6620 1.1213 -509.9336 -418.1949 -1.0136 -1.1491
0.535 0.58 8900 0.4916 -2.0436 -3.1481 0.6660 1.1045 -526.4249 -436.3648 -0.9509 -1.0819
0.5399 0.59 9000 0.4938 -1.9094 -3.0372 0.6630 1.1278 -515.3349 -422.9481 -0.9098 -1.0398
0.512 0.6 9100 0.4937 -1.5132 -2.4976 0.6730 0.9844 -461.3710 -383.3268 -1.0658 -1.2002
0.5069 0.6 9200 0.4931 -1.7907 -2.7553 0.6715 0.9646 -487.1392 -411.0757 -0.9101 -1.0346
0.4272 0.61 9300 0.4919 -1.8152 -2.8886 0.6730 1.0734 -500.4742 -413.5278 -0.9300 -1.0575
0.4398 0.62 9400 0.4936 -2.0627 -3.0248 0.6705 0.9621 -514.0956 -438.2756 -0.8459 -0.9658
0.498 0.62 9500 0.4930 -2.5316 -3.6053 0.6645 1.0737 -572.1414 -485.1664 -0.6523 -0.7637
0.4865 0.63 9600 0.4916 -2.4312 -3.5934 0.6685 1.1621 -570.9479 -475.1278 -0.6562 -0.7693
0.5823 0.63 9700 0.4904 -2.5963 -3.6784 0.6705 1.0821 -579.4501 -491.6361 -0.6136 -0.7246
0.5332 0.64 9800 0.4906 -2.5457 -3.6787 0.6705 1.1330 -579.4781 -486.5714 -0.5180 -0.6230
0.524 0.65 9900 0.4901 -2.1327 -3.1507 0.6750 1.0180 -526.6770 -445.2742 -0.6355 -0.7448
0.4316 0.65 10000 0.4896 -1.9944 -3.0402 0.6725 1.0458 -515.6310 -431.4487 -0.7432 -0.8593
0.3164 0.66 10100 0.4900 -1.8657 -2.9973 0.6715 1.1316 -511.3380 -418.5705 -0.8276 -0.9510
0.517 0.67 10200 0.4926 -2.3350 -3.3238 0.6680 0.9887 -543.9870 -465.5092 -0.7372 -0.8519
0.4479 0.67 10300 0.4911 -2.3958 -3.4309 0.6640 1.0351 -554.7045 -471.5843 -0.7681 -0.8859
0.4663 0.68 10400 0.4915 -2.0540 -3.1053 0.6675 1.0513 -522.1436 -437.4019 -0.8684 -0.9939
0.5752 0.69 10500 0.4915 -2.0426 -3.1656 0.6680 1.1230 -528.1689 -436.2607 -0.9209 -1.0516
0.463 0.69 10600 0.4911 -1.9536 -3.0610 0.6655 1.1073 -517.7099 -427.3689 -0.8792 -1.0066
0.5865 0.7 10700 0.4881 -2.2678 -3.3722 0.6680 1.1044 -548.8290 -458.7841 -0.7627 -0.8827
0.3972 0.71 10800 0.4904 -2.3637 -3.4886 0.6690 1.1249 -560.4706 -468.3778 -0.7830 -0.9055
0.5572 0.71 10900 0.4892 -2.3609 -3.5063 0.6680 1.1454 -562.2438 -468.0954 -0.7710 -0.8925
0.6689 0.72 11000 0.4884 -2.2106 -3.2813 0.6685 1.0707 -539.7462 -453.0659 -0.8341 -0.9571
0.4435 0.73 11100 0.4877 -2.1188 -3.2148 0.6705 1.0960 -533.0965 -443.8869 -0.8864 -1.0134
0.5282 0.73 11200 0.4871 -2.0567 -3.1524 0.6715 1.0957 -526.8535 -437.6731 -0.9027 -1.0309
0.4652 0.74 11300 0.4870 -1.8621 -2.9346 0.6690 1.0725 -505.0730 -418.2159 -0.9259 -1.0542
0.4956 0.75 11400 0.4867 -2.0149 -3.1930 0.6725 1.1781 -530.9140 -433.4950 -0.8660 -0.9940
0.5636 0.75 11500 0.4873 -2.1217 -3.2145 0.6705 1.0928 -533.0626 -444.1773 -0.8628 -0.9883
0.4554 0.76 11600 0.4888 -2.2988 -3.3917 0.6705 1.0929 -550.7822 -461.8896 -0.8416 -0.9660
0.4871 0.77 11700 0.4900 -2.3167 -3.3673 0.6655 1.0507 -548.3438 -463.6716 -0.8322 -0.9553
0.527 0.77 11800 0.4890 -1.9018 -2.9657 0.6690 1.0639 -508.1792 -422.1820 -0.9603 -1.0908
0.569 0.78 11900 0.4888 -2.0736 -3.1962 0.6670 1.1225 -531.2298 -439.3680 -0.9052 -1.0341
0.4233 0.79 12000 0.4888 -2.0965 -3.1915 0.6705 1.0950 -530.7664 -441.6599 -0.9173 -1.0466
0.3903 0.79 12100 0.4903 -1.6617 -2.7032 0.6665 1.0414 -481.9285 -398.1773 -1.0563 -1.1908
0.4775 0.8 12200 0.4900 -1.6698 -2.7266 0.6680 1.0568 -484.2725 -398.9855 -1.0601 -1.1954
0.4513 0.8 12300 0.4890 -1.6321 -2.6987 0.6705 1.0666 -481.4833 -395.2168 -1.0618 -1.1973
0.5514 0.81 12400 0.4893 -1.6054 -2.6422 0.6665 1.0368 -475.8312 -392.5486 -1.0565 -1.1916
0.4187 0.82 12500 0.4877 -1.6813 -2.7806 0.6685 1.0993 -489.6676 -400.1340 -1.0093 -1.1437
0.549 0.82 12600 0.4874 -1.6772 -2.7981 0.6695 1.1209 -491.4220 -399.7243 -1.0171 -1.1529
0.5839 0.83 12700 0.4880 -1.6149 -2.7051 0.6690 1.0903 -482.1238 -393.4917 -1.0345 -1.1701
0.6596 0.84 12800 0.4864 -1.7916 -2.8825 0.6705 1.0909 -499.8600 -411.1650 -0.9965 -1.1303
0.5277 0.84 12900 0.4859 -1.8558 -2.9500 0.6695 1.0942 -506.6070 -417.5810 -0.9771 -1.1100
0.4608 0.85 13000 0.4859 -1.8954 -2.9737 0.6735 1.0783 -508.9827 -421.5428 -0.9614 -1.0929
0.5661 0.86 13100 0.4860 -1.8942 -2.9630 0.6725 1.0688 -507.9122 -421.4239 -0.9514 -1.0824
0.4732 0.86 13200 0.4857 -1.8424 -2.9279 0.6705 1.0855 -504.4016 -416.2484 -0.9614 -1.0934
0.5427 0.87 13300 0.4858 -1.9079 -3.0019 0.6710 1.0941 -511.8058 -422.7933 -0.9451 -1.0766
0.5223 0.88 13400 0.4863 -1.9008 -2.9681 0.6720 1.0673 -508.4213 -422.0847 -0.9559 -1.0872
0.4808 0.88 13500 0.4859 -1.9388 -3.0281 0.6735 1.0893 -514.4193 -425.8812 -0.9376 -1.0681
0.5138 0.89 13600 0.4856 -1.9843 -3.0731 0.6715 1.0888 -518.9196 -430.4352 -0.9361 -1.0668
0.5878 0.9 13700 0.4855 -2.0426 -3.1226 0.6695 1.0800 -523.8743 -436.2664 -0.9280 -1.0581
0.4051 0.9 13800 0.4853 -2.0332 -3.1257 0.6725 1.0925 -524.1822 -435.3295 -0.9284 -1.0587
0.5562 0.91 13900 0.4854 -2.0142 -3.0992 0.6725 1.0850 -521.5326 -433.4284 -0.9257 -1.0554
0.4542 0.92 14000 0.4857 -2.0204 -3.0943 0.6715 1.0739 -521.0421 -434.0428 -0.9270 -1.0565
0.4657 0.92 14100 0.4855 -2.0038 -3.0783 0.6695 1.0745 -519.4431 -432.3822 -0.9273 -1.0567
0.3963 0.93 14200 0.4853 -1.9858 -3.0706 0.6710 1.0848 -518.6724 -430.5839 -0.9247 -1.0540
0.4414 0.94 14300 0.4855 -1.9946 -3.0790 0.6715 1.0844 -519.5145 -431.4666 -0.9262 -1.0557
0.5011 0.94 14400 0.4854 -1.9991 -3.0852 0.6725 1.0861 -520.1354 -431.9193 -0.9237 -1.0528
0.4677 0.95 14500 0.4853 -2.0012 -3.0897 0.6715 1.0885 -520.5853 -432.1261 -0.9249 -1.0543
0.4234 0.96 14600 0.4854 -2.0010 -3.0866 0.6710 1.0856 -520.2672 -432.1037 -0.9283 -1.0579
0.4681 0.96 14700 0.4855 -1.9998 -3.0848 0.6700 1.0851 -520.0927 -431.9801 -0.9267 -1.0560
0.4417 0.97 14800 0.4853 -2.0018 -3.0877 0.6715 1.0859 -520.3868 -432.1882 -0.9254 -1.0549
0.516 0.97 14900 0.4854 -2.0013 -3.0874 0.6700 1.0861 -520.3481 -432.1320 -0.9249 -1.0543
0.5369 0.98 15000 0.4854 -2.0014 -3.0872 0.6705 1.0857 -520.3271 -432.1479 -0.9244 -1.0537
0.442 0.99 15100 0.4853 -2.0000 -3.0858 0.6715 1.0857 -520.1915 -432.0099 -0.9254 -1.0546
0.4814 0.99 15200 0.4854 -1.9998 -3.0852 0.6720 1.0854 -520.1320 -431.9893 -0.9286 -1.0581
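
The table above is whitespace-separated, so it can be parsed for further analysis with a few lines of Python. The sketch below is illustrative: the column keys are hypothetical snake_case versions of the header names, and `SAMPLE_ROWS` holds only four rows copied from the table, not the full log.

```python
# Hypothetical snake_case keys, in the same order as the table header.
COLUMNS = [
    "training_loss", "epoch", "step", "validation_loss",
    "rewards_chosen", "rewards_rejected", "rewards_accuracies",
    "rewards_margins", "logps_rejected", "logps_chosen",
    "logits_rejected", "logits_chosen",
]

# Four rows copied verbatim from the table above (steps 100, 10700, 13800, 15200).
SAMPLE_ROWS = """\
0.6933 0.01 100 0.6927 0.0023 0.0014 0.4950 0.0009 -211.4760 -231.7798 -2.1609 -2.3494
0.5865 0.7 10700 0.4881 -2.2678 -3.3722 0.6680 1.1044 -548.8290 -458.7841 -0.7627 -0.8827
0.4051 0.9 13800 0.4853 -2.0332 -3.1257 0.6725 1.0925 -524.1822 -435.3295 -0.9284 -1.0587
0.4814 0.99 15200 0.4854 -1.9998 -3.0852 0.6720 1.0854 -520.1320 -431.9893 -0.9286 -1.0581
"""


def parse_rows(text: str) -> list[dict]:
    """Turn whitespace-separated table rows into a list of dicts."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # skip blank lines and anything that is not a 12-field row
        record = dict(zip(COLUMNS, map(float, fields)))
        record["step"] = int(record["step"])
        records.append(record)
    return records


records = parse_rows(SAMPLE_ROWS)
# min() returns the first row with the lowest validation loss.
best = min(records, key=lambda r: r["validation_loss"])
print(best["step"], best["validation_loss"])
```

Among these four sample rows, the lowest validation loss is 0.4853 at step 13800. Running the same function over the full table gives every evaluation step in a form that is easy to plot or filter.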

Framework versions

  • PEFT 0.7.1
  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.6
  • Tokenizers 0.15.2