zephyr-7b-ipo-qlora-v0

This model is a fine-tuned version of alignment-handbook/zephyr-7b-sft-qlora on the HuggingFaceH4/ultrafeedback_binarized dataset. It achieves the following results on the evaluation set:

  • Loss: 1755.6567
  • Rewards/chosen: -0.1146
  • Rewards/rejected: -0.2709
  • Rewards/accuracies: 0.6635
  • Rewards/margins: 0.1563
  • Logps/rejected: -238.7033
  • Logps/chosen: -243.4698
  • Logits/rejected: -2.1575
  • Logits/chosen: -2.3447
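
As a quick consistency check on the figures above, the reported margin should equal the chosen reward minus the rejected reward. The sketch below assumes the usual convention for preference-tuning metrics (margin is the mean of chosen minus rejected rewards, accuracy is the fraction of pairs where the chosen reward is higher); the card itself does not define these terms.

```python
# Evaluation metrics copied from the list above.
rewards_chosen = -0.1146
rewards_rejected = -0.2709
reported_margin = 0.1563

# Assumed convention: margin = mean(chosen - rejected),
# which equals mean(chosen) - mean(rejected).
margin = rewards_chosen - rewards_rejected

print(f"computed margin: {margin:.4f}")
print(f"reported margin: {reported_margin:.4f}")
```

Under that assumption the two values agree to the four decimal places the card reports.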

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
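
The batch-size and scheduler settings above can be sanity-checked with a few lines of Python. This is a minimal sketch, not the training code: it assumes the usual relationship total batch = per-device batch × accumulation steps × number of processes, and a linear-warmup, cosine-decay schedule. `TOTAL_STEPS` is a hypothetical round number, since the card does not state the exact step count. Note that the listed values imply a single process, even though `distributed_type` is multi-GPU; the card does not list the device count, so this is only an inference.

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 2
gradient_accumulation_steps = 2
total_train_batch_size = 4
learning_rate = 5e-06
warmup_ratio = 0.1

# Assumed relationship: total = per-device batch x accumulation x processes.
# The process count is inferred here; it is not stated in the card.
implied_processes = total_train_batch_size // (
    train_batch_size * gradient_accumulation_steps
)


def cosine_lr(step, total_steps, peak_lr=learning_rate, ratio=warmup_ratio):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 15_000  # hypothetical, chosen only for illustration

print("implied processes:", implied_processes)
for step in (0, 1_500, 8_250, 15_000):
    print(step, cosine_lr(step, TOTAL_STEPS))
```

With these assumptions the learning rate peaks at 5e-06 once warmup ends (10% of the run) and decays toward zero by the final step.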

Training results

Training Loss Epoch Step Validation Loss Rewards/chosen Rewards/rejected Rewards/accuracies Rewards/margins Logps/rejected Logps/chosen Logits/rejected Logits/chosen
2504.3189 0.01 100 2491.3911 0.0023 0.0014 0.4915 0.0009 -211.4747 -231.7778 -2.1595 -2.3479
2448.3082 0.01 200 2442.3884 0.0092 0.0031 0.5740 0.0061 -211.3044 -231.0850 -2.1583 -2.3464
2268.8053 0.02 300 2338.6184 0.0340 0.0148 0.5855 0.0192 -210.1324 -228.6008 -2.1598 -2.3480
1992.3912 0.03 400 2218.6030 0.0424 0.0045 0.6080 0.0379 -211.1646 -227.7647 -2.1598 -2.3477
1894.6148 0.03 500 2099.7009 0.0381 -0.0294 0.6090 0.0675 -214.5512 -228.1932 -2.1647 -2.3531
2250.0029 0.04 600 2123.8472 -0.1488 -0.2087 0.6100 0.0599 -232.4783 -246.8839 -2.1789 -2.3689
2168.8156 0.05 700 2054.6855 -0.0378 -0.1127 0.6205 0.0749 -222.8819 -235.7833 -2.2053 -2.3973
2370.3576 0.05 800 1997.0155 -0.0503 -0.1449 0.6140 0.0946 -226.0987 -237.0312 -2.2323 -2.4269
2219.9355 0.06 900 2038.8440 0.0273 -0.0582 0.6200 0.0855 -217.4353 -229.2796 -2.2201 -2.4121
2382.9531 0.07 1000 2008.1322 -0.0021 -0.0959 0.6215 0.0937 -221.1991 -232.2193 -2.1832 -2.3707
2155.5914 0.07 1100 2003.2480 -0.2053 -0.3001 0.6245 0.0949 -241.6257 -252.5312 -2.1067 -2.2893
2408.5309 0.08 1200 1978.7773 0.0387 -0.0554 0.6405 0.0942 -217.1562 -228.1301 -2.1426 -2.3282
2340.302 0.09 1300 1959.7281 0.0315 -0.0587 0.6445 0.0903 -217.4835 -228.8514 -2.1681 -2.3558
1628.1867 0.09 1400 1933.3713 -0.0287 -0.1379 0.6425 0.1092 -225.3989 -234.8743 -2.1491 -2.3343
1639.7521 0.1 1500 1932.8866 0.0042 -0.1095 0.6315 0.1137 -222.5614 -231.5808 -2.1305 -2.3128
2100.5828 0.1 1600 1981.9399 -0.0316 -0.1510 0.6245 0.1194 -226.7130 -235.1657 -2.1246 -2.3058
1859.6395 0.11 1700 1915.6974 -0.1198 -0.2263 0.6345 0.1066 -234.2446 -243.9821 -2.1079 -2.2892
1918.483 0.12 1800 1937.4408 -0.2171 -0.3099 0.6330 0.0927 -242.6006 -253.7192 -2.1282 -2.3127
1457.3569 0.12 1900 1911.1393 -0.1721 -0.3052 0.6430 0.1332 -242.1366 -249.2134 -2.1054 -2.2870
2512.6883 0.13 2000 1904.9348 -0.1102 -0.2461 0.6395 0.1359 -236.2243 -243.0244 -2.1032 -2.2844
2032.659 0.14 2100 1915.6288 -0.1119 -0.2359 0.6360 0.1240 -235.2046 -243.1993 -2.1217 -2.3055
2207.576 0.14 2200 1899.6105 -0.0187 -0.1441 0.6400 0.1255 -226.0232 -233.8711 -2.1026 -2.2840
1761.0438 0.15 2300 1910.9148 -0.0001 -0.1181 0.6365 0.1180 -223.4245 -232.0163 -2.1030 -2.2853
1459.2476 0.16 2400 1888.1434 -0.0668 -0.2003 0.6350 0.1335 -231.6436 -238.6845 -2.0794 -2.2609
1836.8891 0.16 2500 1875.6038 -0.1268 -0.2632 0.6375 0.1364 -237.9300 -244.6820 -2.0871 -2.2696
1885.9926 0.17 2600 1840.0220 -0.1835 -0.3281 0.6570 0.1446 -244.4172 -250.3530 -2.1104 -2.2955
1692.8492 0.18 2700 1861.0566 -0.1522 -0.2731 0.6550 0.1209 -238.9260 -247.2276 -2.1196 -2.3050
1884.091 0.18 2800 1863.0623 -0.0410 -0.1731 0.6440 0.1321 -228.9247 -236.1043 -2.0999 -2.2835
1507.8515 0.19 2900 1893.5802 -0.1903 -0.3166 0.6435 0.1262 -243.2698 -251.0389 -2.0429 -2.2201
1346.5843 0.2 3000 1877.9161 -0.0825 -0.2331 0.6505 0.1506 -234.9221 -240.2551 -2.0497 -2.2280
1925.325 0.2 3100 1869.2515 -0.0534 -0.1976 0.6435 0.1442 -231.3750 -237.3496 -2.0729 -2.2523
1560.9117 0.21 3200 1867.8296 -0.0958 -0.2346 0.6435 0.1388 -235.0729 -241.5852 -2.0943 -2.2768
1823.3514 0.22 3300 1848.3113 -0.1345 -0.2736 0.6430 0.1392 -238.9745 -245.4509 -2.0495 -2.2288
1617.63 0.22 3400 1866.4885 -0.1702 -0.3061 0.6415 0.1358 -242.2180 -249.0295 -2.0497 -2.2291
1438.7666 0.23 3500 1877.9581 -0.3109 -0.4536 0.6470 0.1426 -256.9674 -263.0986 -1.9701 -2.1436
2120.1508 0.24 3600 1857.2382 -0.2311 -0.3700 0.6505 0.1389 -248.6132 -255.1127 -1.9101 -2.0810
2384.8625 0.24 3700 1852.2377 -0.1579 -0.2979 0.6500 0.1400 -241.4033 -247.7925 -2.0020 -2.1780
1321.1524 0.25 3800 1872.4688 -0.1539 -0.2759 0.6470 0.1219 -239.1979 -247.3985 -2.0657 -2.2452
1975.567 0.26 3900 1848.4305 -0.0515 -0.2055 0.6555 0.1541 -232.1653 -237.1519 -2.0325 -2.2111
1914.4258 0.26 4000 1862.8907 -0.1395 -0.2767 0.6535 0.1372 -239.2816 -245.9548 -2.0401 -2.2187
2663.7783 0.27 4100 1853.1919 -0.1526 -0.2899 0.6535 0.1373 -240.6036 -247.2679 -2.0377 -2.2153
1816.1697 0.27 4200 1847.7046 -0.1257 -0.2646 0.6560 0.1389 -238.0683 -244.5728 -2.0271 -2.2035
2080.9717 0.28 4300 1834.1217 -0.1441 -0.2826 0.6550 0.1385 -239.8744 -246.4135 -2.0350 -2.2132
1718.8639 0.29 4400 1853.1575 -0.1808 -0.3348 0.6580 0.1539 -245.0880 -250.0867 -2.0238 -2.2019
1744.0963 0.29 4500 1841.0155 -0.2073 -0.3496 0.6555 0.1423 -246.5689 -252.7311 -2.0476 -2.2260
1741.5717 0.3 4600 1874.8073 -0.1364 -0.2532 0.6440 0.1168 -236.9356 -245.6457 -2.0375 -2.2154
1377.8717 0.31 4700 1864.4313 -0.1412 -0.2982 0.6500 0.1570 -241.4357 -246.1263 -2.0732 -2.2540
1664.0443 0.31 4800 1858.1903 -0.1020 -0.2517 0.6425 0.1497 -236.7814 -242.2027 -2.0735 -2.2518
1791.327 0.32 4900 1849.7412 -0.1771 -0.3244 0.6455 0.1473 -244.0522 -249.7118 -2.1022 -2.2827
1540.4815 0.33 5000 1836.6489 -0.1776 -0.3258 0.6495 0.1482 -244.1883 -249.7621 -2.1425 -2.3259
1773.1312 0.33 5100 1836.5369 -0.1486 -0.3044 0.6530 0.1558 -242.0567 -246.8674 -2.1456 -2.3287
1598.2573 0.34 5200 1844.1843 -0.1412 -0.2753 0.6420 0.1341 -239.1406 -246.1282 -2.1671 -2.3522
2181.6221 0.35 5300 1859.4578 -0.1634 -0.3164 0.6465 0.1529 -243.2481 -248.3481 -2.1607 -2.3463
1823.2314 0.35 5400 1852.0754 -0.1038 -0.2571 0.6540 0.1533 -237.3202 -242.3835 -2.1606 -2.3461
1672.017 0.36 5500 1837.0868 -0.1032 -0.2389 0.6500 0.1357 -235.5011 -242.3273 -2.1677 -2.3532
1779.0814 0.37 5600 1833.4418 -0.1005 -0.2407 0.6480 0.1402 -235.6827 -242.0591 -2.1607 -2.3457
2283.9326 0.37 5700 1836.9828 -0.0903 -0.2399 0.6560 0.1496 -235.6059 -241.0341 -2.1508 -2.3346
1948.3748 0.38 5800 1821.6624 -0.0879 -0.2388 0.6550 0.1509 -235.4922 -240.7903 -2.1613 -2.3468
2247.3281 0.39 5900 1835.3883 -0.0937 -0.2330 0.6540 0.1392 -234.9075 -241.3790 -2.1440 -2.3277
1582.7374 0.39 6000 1837.4578 -0.1967 -0.3376 0.6545 0.1408 -245.3686 -251.6784 -2.1374 -2.3216
2022.2906 0.4 6100 1846.5861 -0.2357 -0.3634 0.6510 0.1277 -247.9513 -255.5701 -2.1382 -2.3229
1575.1918 0.41 6200 1803.5978 -0.1455 -0.3031 0.6605 0.1576 -241.9267 -246.5572 -2.1333 -2.3175
1673.5182 0.41 6300 1812.6738 -0.1278 -0.2900 0.6565 0.1622 -240.6118 -244.7870 -2.1516 -2.3377
1820.5289 0.42 6400 1817.4550 -0.1355 -0.2915 0.6515 0.1560 -240.7580 -245.5502 -2.1188 -2.3016
1858.2232 0.43 6500 1821.5573 -0.1460 -0.2868 0.6510 0.1407 -240.2900 -246.6088 -2.1219 -2.3056
2088.2963 0.43 6600 1821.9972 -0.1364 -0.2740 0.6475 0.1375 -239.0106 -245.6490 -2.1621 -2.3492
1938.0561 0.44 6700 1810.6576 -0.1937 -0.3383 0.6490 0.1446 -245.4427 -251.3741 -2.1807 -2.3693
1672.7229 0.44 6800 1816.6731 -0.1883 -0.3305 0.6530 0.1422 -244.6622 -250.8365 -2.1640 -2.3512
1800.1967 0.45 6900 1805.9200 -0.1464 -0.3037 0.6515 0.1573 -241.9807 -246.6402 -2.1636 -2.3503
1767.4076 0.46 7000 1820.3805 -0.0676 -0.2107 0.6465 0.1431 -232.6849 -238.7679 -2.1441 -2.3294
1597.6004 0.46 7100 1808.8562 -0.0877 -0.2361 0.6500 0.1483 -235.2173 -240.7786 -2.1259 -2.3091
1694.7447 0.47 7200 1807.7433 -0.0694 -0.2082 0.6485 0.1389 -232.4332 -238.9400 -2.1365 -2.3213
1729.9865 0.48 7300 1804.1646 -0.1292 -0.2812 0.6575 0.1520 -239.7330 -244.9228 -2.1347 -2.3204
2149.3504 0.48 7400 1797.1915 -0.1684 -0.3177 0.6520 0.1493 -243.3855 -248.8470 -2.1265 -2.3111
1329.7865 0.49 7500 1795.6615 -0.1734 -0.3339 0.6585 0.1604 -244.9986 -249.3482 -2.1367 -2.3220
1424.3902 0.5 7600 1812.0533 -0.1370 -0.3042 0.6565 0.1673 -242.0367 -245.7009 -2.1274 -2.3109
1652.7855 0.5 7700 1805.7570 -0.1359 -0.3000 0.6525 0.1641 -241.6126 -245.5912 -2.1331 -2.3181
1540.4484 0.51 7800 1808.3578 -0.1572 -0.3103 0.6560 0.1531 -242.6407 -247.7226 -2.1168 -2.2997
1297.3996 0.52 7900 1798.0718 -0.1532 -0.3176 0.6500 0.1644 -243.3756 -247.3274 -2.1292 -2.3135
2462.926 0.52 8000 1784.8866 -0.1422 -0.2914 0.6555 0.1493 -240.7565 -246.2211 -2.1410 -2.3266
1812.3775 0.53 8100 1789.9877 -0.1253 -0.2639 0.6545 0.1386 -238.0041 -244.5349 -2.1356 -2.3208
1606.6738 0.54 8200 1797.2037 -0.0952 -0.2416 0.6460 0.1464 -235.7737 -241.5288 -2.1190 -2.3023
1769.8457 0.54 8300 1802.8728 -0.1072 -0.2516 0.6450 0.1445 -236.7761 -242.7215 -2.1232 -2.3069
1652.7957 0.55 8400 1804.0986 -0.1504 -0.2966 0.6460 0.1462 -241.2766 -247.0452 -2.1298 -2.3138
2091.1088 0.56 8500 1803.5408 -0.1569 -0.3175 0.6510 0.1606 -243.3613 -247.6962 -2.1372 -2.3225
1515.3847 0.56 8600 1784.4094 -0.1280 -0.2855 0.6560 0.1576 -240.1655 -244.8018 -2.1503 -2.3367
1773.4947 0.57 8700 1781.3252 -0.1485 -0.2962 0.6540 0.1477 -241.2365 -246.8561 -2.1572 -2.3440
1795.5312 0.58 8800 1785.9962 -0.1188 -0.2773 0.6585 0.1585 -239.3465 -243.8881 -2.1596 -2.3469
1880.782 0.58 8900 1782.6388 -0.1450 -0.3047 0.6545 0.1597 -242.0805 -246.5005 -2.1572 -2.3445
1695.9539 0.59 9000 1783.7203 -0.1129 -0.2791 0.6625 0.1662 -239.5215 -243.2946 -2.1612 -2.3487
1709.6678 0.6 9100 1782.3418 -0.1323 -0.2750 0.6590 0.1427 -239.1156 -245.2378 -2.1555 -2.3422
1829.7031 0.6 9200 1774.4548 -0.1386 -0.2925 0.6610 0.1538 -240.8571 -245.8683 -2.1593 -2.3467
1540.8942 0.61 9300 1782.9685 -0.1619 -0.3198 0.6570 0.1578 -243.5889 -248.1998 -2.1635 -2.3515
1477.422 0.62 9400 1785.5961 -0.2024 -0.3448 0.6525 0.1423 -246.0874 -252.2482 -2.1583 -2.3456
1495.3285 0.62 9500 1796.2745 -0.2120 -0.3619 0.6510 0.1499 -247.7974 -253.2010 -2.1587 -2.3461
1647.9816 0.63 9600 1805.8228 -0.1656 -0.3324 0.6540 0.1668 -244.8505 -248.5635 -2.1655 -2.3532
2050.0473 0.63 9700 1781.7717 -0.1726 -0.3257 0.6585 0.1531 -244.1856 -249.2694 -2.1626 -2.3501
1848.6725 0.64 9800 1796.3894 -0.1546 -0.3209 0.6610 0.1663 -243.7036 -247.4636 -2.1564 -2.3433
1784.2059 0.65 9900 1775.6644 -0.1329 -0.2869 0.6625 0.1540 -240.3027 -245.2966 -2.1599 -2.3469
1470.263 0.65 10000 1772.5385 -0.1364 -0.2911 0.6580 0.1547 -240.7244 -245.6487 -2.1575 -2.3443
1144.0941 0.66 10100 1776.5482 -0.1295 -0.2930 0.6615 0.1635 -240.9152 -244.9547 -2.1529 -2.3393
1890.1879 0.67 10200 1785.0319 -0.1876 -0.3331 0.6530 0.1455 -244.9171 -250.7639 -2.1441 -2.3297
1441.0404 0.67 10300 1778.0973 -0.1516 -0.3046 0.6550 0.1530 -242.0714 -247.1607 -2.1502 -2.3364
1606.4429 0.68 10400 1776.5415 -0.1177 -0.2715 0.6605 0.1538 -238.7640 -243.7791 -2.1507 -2.3370
2053.1404 0.69 10500 1771.5421 -0.0938 -0.2519 0.6620 0.1581 -236.8025 -241.3824 -2.1503 -2.3366
1666.0459 0.69 10600 1766.1060 -0.1016 -0.2592 0.6615 0.1576 -237.5320 -242.1644 -2.1512 -2.3377
2062.3629 0.7 10700 1763.6278 -0.1257 -0.2840 0.6635 0.1583 -240.0120 -244.5764 -2.1536 -2.3403
1241.0871 0.71 10800 1766.0503 -0.1195 -0.2774 0.6635 0.1579 -239.3553 -243.9581 -2.1600 -2.3478
1870.9098 0.71 10900 1764.2948 -0.1284 -0.2914 0.6620 0.1630 -240.7504 -244.8407 -2.1578 -2.3453
2322.8574 0.72 11000 1763.4559 -0.1314 -0.2848 0.6640 0.1534 -240.0938 -245.1432 -2.1528 -2.3398
1666.5447 0.73 11100 1763.6829 -0.1301 -0.2843 0.6590 0.1542 -240.0467 -245.0160 -2.1505 -2.3374
1670.8051 0.73 11200 1761.8049 -0.1281 -0.2829 0.6635 0.1547 -239.8986 -244.8195 -2.1521 -2.3394
1693.2752 0.74 11300 1757.5520 -0.1263 -0.2836 0.6650 0.1574 -239.9730 -244.6301 -2.1602 -2.3483
1789.823 0.75 11400 1758.2073 -0.1165 -0.2815 0.6700 0.1650 -239.7627 -243.6555 -2.1659 -2.3545
1808.4945 0.75 11500 1761.2283 -0.0922 -0.2485 0.6665 0.1564 -236.4669 -241.2205 -2.1687 -2.3576
1721.7291 0.76 11600 1762.3303 -0.1280 -0.2793 0.6615 0.1514 -239.5466 -244.8028 -2.1560 -2.3436
1471.2858 0.77 11700 1764.1798 -0.1383 -0.2894 0.6610 0.1511 -240.5538 -245.8327 -2.1572 -2.3449
1792.252 0.77 11800 1759.8867 -0.1212 -0.2769 0.6670 0.1557 -239.3020 -244.1205 -2.1683 -2.3568
2080.723 0.78 11900 1758.6500 -0.1110 -0.2693 0.6665 0.1583 -238.5435 -243.1097 -2.1673 -2.3557
1576.5392 0.79 12000 1760.5195 -0.0802 -0.2372 0.6670 0.1569 -235.3284 -240.0280 -2.1685 -2.3569
1299.8462 0.79 12100 1767.3917 -0.0787 -0.2307 0.6580 0.1520 -234.6803 -239.875 -2.1674 -2.3554
1648.0504 0.8 12200 1765.9279 -0.0887 -0.2411 0.6600 0.1524 -235.7217 -240.8781 -2.1653 -2.3530
1886.5662 0.8 12300 1764.2982 -0.0903 -0.2429 0.6615 0.1526 -235.9053 -241.0387 -2.1655 -2.3531
1838.3824 0.81 12400 1764.4016 -0.0944 -0.2425 0.6610 0.1481 -235.8634 -241.4447 -2.1671 -2.3548
1238.4372 0.82 12500 1760.1107 -0.0978 -0.2518 0.6610 0.1541 -236.7934 -241.7809 -2.1651 -2.3529
1967.9301 0.82 12600 1757.7245 -0.0944 -0.2506 0.6615 0.1561 -236.6711 -241.4498 -2.1653 -2.3530
1912.1277 0.83 12700 1757.3872 -0.0946 -0.2503 0.6620 0.1557 -236.6407 -241.4644 -2.1667 -2.3547
2416.5143 0.84 12800 1756.4773 -0.1081 -0.2621 0.6610 0.1541 -237.8257 -242.8119 -2.1635 -2.3512
1621.5686 0.84 12900 1755.8291 -0.1139 -0.2680 0.6650 0.1541 -238.4072 -243.3935 -2.1621 -2.3497
1645.2689 0.85 13000 1755.0060 -0.1126 -0.2658 0.6640 0.1532 -238.1906 -243.2641 -2.1603 -2.3477
2114.2795 0.86 13100 1753.9948 -0.1178 -0.2724 0.6650 0.1546 -238.8559 -243.7840 -2.1587 -2.3460
1719.5012 0.86 13200 1755.7775 -0.1072 -0.2643 0.6630 0.1570 -238.0386 -242.7276 -2.1607 -2.3481
2001.4379 0.87 13300 1754.6166 -0.1093 -0.2668 0.6635 0.1575 -238.2886 -242.9348 -2.1600 -2.3474
1796.4686 0.88 13400 1754.6705 -0.1121 -0.2687 0.6660 0.1565 -238.4784 -243.2174 -2.1594 -2.3468
1621.7527 0.88 13500 1753.5303 -0.1157 -0.2722 0.6640 0.1565 -238.8311 -243.5743 -2.1593 -2.3467
2175.7262 0.89 13600 1754.9380 -0.1189 -0.2747 0.6650 0.1558 -239.0788 -243.8902 -2.1585 -2.3458
2119.6848 0.9 13700 1754.2703 -0.1205 -0.2743 0.6655 0.1538 -239.0397 -244.0544 -2.1569 -2.3441
1448.3924 0.9 13800 1754.0623 -0.1175 -0.2731 0.6650 0.1556 -238.9212 -243.7544 -2.1582 -2.3455
1953.3191 0.91 13900 1754.0231 -0.1157 -0.2721 0.6650 0.1563 -238.8184 -243.5795 -2.1588 -2.3461
1684.251 0.92 14000 1754.7476 -0.1146 -0.2703 0.6635 0.1557 -238.6375 -243.4636 -2.1591 -2.3464
1545.3156 0.92 14100 1755.1064 -0.1155 -0.2712 0.6650 0.1557 -238.7297 -243.5513 -2.1574 -2.3446
1391.3224 0.93 14200 1756.5656 -0.1138 -0.2702 0.6615 0.1564 -238.6280 -243.3831 -2.1568 -2.3439
1588.1222 0.94 14300 1755.2308 -0.1150 -0.2712 0.6650 0.1561 -238.7296 -243.5083 -2.1580 -2.3452
1734.7881 0.94 14400 1755.4102 -0.1145 -0.2715 0.6660 0.1570 -238.7584 -243.4543 -2.1571 -2.3442
1655.0535 0.95 14500 1754.9426 -0.1140 -0.2711 0.6630 0.1571 -238.7228 -243.4064 -2.1575 -2.3447
1412.3269 0.96 14600 1756.2190 -0.1147 -0.2713 0.6655 0.1566 -238.7402 -243.4749 -2.1581 -2.3453
1504.9481 0.96 14700 1756.3967 -0.1149 -0.2713 0.6645 0.1564 -238.7450 -243.4981 -2.1568 -2.3440
1509.7718 0.97 14800 1755.3798 -0.1147 -0.2711 0.6625 0.1564 -238.7248 -243.4784 -2.1576 -2.3448
1881.0627 0.97 14900 1755.1472 -0.1146 -0.2710 0.6650 0.1564 -238.7102 -243.4660 -2.1580 -2.3452
1820.8113 0.98 15000 1754.8676 -0.1146 -0.2710 0.6660 0.1563 -238.7073 -243.4661 -2.1581 -2.3454
1512.1538 0.99 15100 1754.2145 -0.1146 -0.2709 0.6635 0.1563 -238.7005 -243.4615 -2.1581 -2.3453
1698.8312 0.99 15200 1755.1842 -0.1146 -0.2708 0.6630 0.1562 -238.6915 -243.4615 -2.1577 -2.3449
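
The rows above are whitespace-separated, so they can be parsed directly for analysis. The sketch below is one possible way to do it, using three rows copied from the table (steps 100, 13500, and 15200) and column names taken from the header; the helper `parse_rows` is a hypothetical name introduced for illustration.

```python
# Column names taken from the table header (two of them contain spaces).
COLUMNS = [
    "Training Loss", "Epoch", "Step", "Validation Loss",
    "Rewards/chosen", "Rewards/rejected", "Rewards/accuracies",
    "Rewards/margins", "Logps/rejected", "Logps/chosen",
    "Logits/rejected", "Logits/chosen",
]

# Three rows copied verbatim from the table; paste more rows as needed.
ROWS = """\
2504.3189 0.01 100 2491.3911 0.0023 0.0014 0.4915 0.0009 -211.4747 -231.7778 -2.1595 -2.3479
1621.7527 0.88 13500 1753.5303 -0.1157 -0.2722 0.6640 0.1565 -238.8311 -243.5743 -2.1593 -2.3467
1698.8312 0.99 15200 1755.1842 -0.1146 -0.2708 0.6630 0.1562 -238.6915 -243.4615 -2.1577 -2.3449
"""


def parse_rows(text):
    """Turn whitespace-separated rows into a list of dicts keyed by column."""
    records = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        if len(values) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
        record = dict(zip(COLUMNS, values))
        record["Step"] = int(record["Step"])
        records.append(record)
    return records


records = parse_rows(ROWS)
best = min(records, key=lambda r: r["Validation Loss"])
print(best["Step"], best["Validation Loss"])
```

Among the three sample rows, the lowest validation loss is the one at step 13500; running the same code over every row gives the minimum for the whole run.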

Framework versions

  • PEFT 0.7.1
  • Transformers 4.36.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.6
  • Tokenizers 0.15.2