Mojiz-DPO

Files changed (4)

README.md +19 -49
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,14 +17,14 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [ahmedabdelwahed/Mojiz-sft](https://huggingface.co/ahmedabdelwahed/Mojiz-sft) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0000
-- Rewards/chosen: 22.2209
-- Rewards/rejected: -9.2802
 - Rewards/accuracies: 1.0
-- Rewards/margins: 31.5011
-- Logps/rejected: -89.5998
-- Logps/chosen: -281.8713
-- Logits/rejected: -11.8496
-- Logits/chosen: -12.9621
 ## Model description
@@ -43,59 +43,29 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0003
 - train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 150
-- training_steps: 4000
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.0001        | 0.41  | 100  | 0.0000          | 13.9027        | -6.9708          | 1.0                | 20.8735         | -84.9811       | -298.5077    | -11.2670        | -12.1892      |
-| 0.0           | 0.82  | 200  | 0.0000          | 17.3896        | -7.8690          | 1.0                | 25.2586         | -86.7774       | -291.5338    | -11.3083        | -12.2870      |
-| 0.0005        | 1.22  | 300  | 0.0000          | 17.6416        | -8.8830          | 1.0                | 26.5246         | -88.8054       | -291.0298    | -11.3003        | -12.3102      |
-| 0.0           | 1.63  | 400  | 0.0000          | 17.8706        | -10.2554         | 1.0                | 28.1260         | -91.5502       | -290.5718    | -11.2508        | -12.2176      |
-| 0.0           | 2.04  | 500  | 0.0000          | 18.1057        | -10.4493         | 1.0                | 28.5550         | -91.9381       | -290.1017    | -11.2773        | -12.2732      |
-| 0.0           | 2.45  | 600  | 0.0000          | 18.1832        | -10.3990         | 1.0                | 28.5822         | -91.8375       | -289.9467    | -11.2881        | -12.2895      |
-| 0.0           | 2.86  | 700  | 0.0000          | 16.7929        | -12.9826         | 1.0                | 29.7756         | -97.0047       | -292.7272    | -11.2594        | -12.0447      |
-| 0.0           | 3.27  | 800  | 0.0000          | 21.2291        | -7.5837          | 1.0                | 28.8129         | -86.2069       | -283.8548    | -11.7781        | -12.9026      |
-| 0.0           | 3.67  | 900  | 0.0000          | 21.2500        | -7.6403          | 1.0                | 28.8902         | -86.3200       | -283.8131    | -11.7771        | -12.9005      |
-| 0.0           | 4.08  | 1000 | 0.0000          | 21.3141        | -8.0568          | 1.0                | 29.3709         | -87.1530       | -283.6848    | -11.7649        | -12.8725      |
-| 0.0           | 4.49  | 1100 | 0.0000          | 21.2289        | -8.8084          | 1.0                | 30.0373         | -88.6562       | -283.8553    | -11.7555        | -12.8438      |
-| 0.0           | 4.9   | 1200 | 0.0000          | 21.2556        | -8.8423          | 1.0                | 30.0979         | -88.7241       | -283.8019    | -11.7538        | -12.8404      |
-| 0.0           | 5.31  | 1300 | 0.0000          | 21.3255        | -9.0397          | 1.0                | 30.3653         | -89.1189       | -283.6620    | -11.7587        | -12.8441      |
-| 0.0           | 5.71  | 1400 | 0.0000          | 21.3586        | -9.0546          | 1.0                | 30.4132         | -89.1486       | -283.5958    | -11.7606        | -12.8458      |
-| 0.0           | 6.12  | 1500 | 0.0000          | 21.3779        | -9.0803          | 1.0                | 30.4582         | -89.2000       | -283.5573    | -11.7600        | -12.8442      |
-| 0.0001        | 6.53  | 1600 | 0.0000          | 21.4632        | -8.9069          | 1.0                | 30.3700         | -88.8531       | -283.3867    | -11.7795        | -12.8770      |
-| 0.0           | 6.94  | 1700 | 0.0000          | 21.4992        | -8.8450          | 1.0                | 30.3441         | -88.7293       | -283.3147    | -11.7894        | -12.8933      |
-| 0.0           | 7.35  | 1800 | 0.0000          | 21.5109        | -8.8690          | 1.0                | 30.3799         | -88.7774       | -283.2912    | -11.7896        | -12.8931      |
-| 0.0           | 7.76  | 1900 | 0.0000          | 21.5322        | -8.9067          | 1.0                | 30.4390         | -88.8529       | -283.2486    | -11.7905        | -12.8917      |
-| 0.0           | 8.16  | 2000 | 0.0000          | 21.5638        | -8.9260          | 1.0                | 30.4898         | -88.8915       | -283.1855    | -11.7910        | -12.8912      |
-| 0.0           | 8.57  | 2100 | 0.0000          | 21.5688        | -8.9323          | 1.0                | 30.5011         | -88.9041       | -283.1755    | -11.7910        | -12.8909      |
-| 0.0           | 8.98  | 2200 | 0.0000          | 21.5834        | -8.9761          | 1.0                | 30.5596         | -88.9917       | -283.1462    | -11.7917        | -12.8894      |
-| 0.0           | 9.39  | 2300 | 0.0000          | 21.7552        | -9.1179          | 1.0                | 30.8731         | -89.2752       | -282.8026    | -11.7852        | -12.8848      |
-| 0.0           | 9.8   | 2400 | 0.0000          | 21.7651        | -9.1413          | 1.0                | 30.9064         | -89.3221       | -282.7829    | -11.7850        | -12.8839      |
-| 0.0           | 10.2  | 2500 | 0.0000          | 21.7780        | -9.1572          | 1.0                | 30.9351         | -89.3538       | -282.7572    | -11.7861        | -12.8846      |
-| 0.0           | 10.61 | 2600 | 0.0000          | 21.8023        | -9.1708          | 1.0                | 30.9731         | -89.3810       | -282.7084    | -11.7872        | -12.8865      |
-| 0.0           | 11.02 | 2700 | 0.0000          | 21.8155        | -9.1793          | 1.0                | 30.9948         | -89.3980       | -282.6820    | -11.7879        | -12.8875      |
-| 0.0           | 11.43 | 2800 | 0.0000          | 21.9589        | -9.1294          | 1.0                | 31.0883         | -89.2983       | -282.3953    | -11.8192        | -12.9290      |
-| 0.0           | 11.84 | 2900 | 0.0000          | 21.9736        | -9.1440          | 1.0                | 31.1177         | -89.3275       | -282.3658    | -11.8210        | -12.9310      |
-| 0.0           | 12.24 | 3000 | 0.0000          | 21.9922        | -9.1667          | 1.0                | 31.1589         | -89.3729       | -282.3287    | -11.8189        | -12.9279      |
-| 0.0           | 12.65 | 3100 | 0.0000          | 21.9939        | -9.1697          | 1.0                | 31.1636         | -89.3788       | -282.3252    | -11.8189        | -12.9278      |
-| 0.0           | 13.06 | 3200 | 0.0000          | 22.0187        | -9.2172          | 1.0                | 31.2360         | -89.4739       | -282.2755    | -11.8188        | -12.9264      |
-| 0.0           | 13.47 | 3300 | 0.0000          | 22.0256        | -9.2325          | 1.0                | 31.2581         | -89.5044       | -282.2618    | -11.8194        | -12.9272      |
-| 0.0           | 13.88 | 3400 | 0.0000          | 22.0341        | -9.2495          | 1.0                | 31.2836         | -89.5385       | -282.2448    | -11.8192        | -12.9267      |
-| 0.0           | 14.29 | 3500 | 0.0000          | 22.0343        | -9.2506          | 1.0                | 31.2849         | -89.5407       | -282.2444    | -11.8192        | -12.9267      |
-| 0.0           | 14.69 | 3600 | 0.0000          | 22.2061        | -9.2685          | 1.0                | 31.4746         | -89.5764       | -281.9009    | -11.8502        | -12.9633      |
-| 0.0           | 15.1  | 3700 | 0.0000          | 22.2073        | -9.2691          | 1.0                | 31.4763         | -89.5776       | -281.8986    | -11.8503        | -12.9634      |
-| 0.0           | 15.51 | 3800 | 0.0000          | 22.2107        | -9.2755          | 1.0                | 31.4862         | -89.5905       | -281.8916    | -11.8501        | -12.9628      |
-| 0.0           | 15.92 | 3900 | 0.0000          | 22.2207        | -9.2795          | 1.0                | 31.5003         | -89.5985       | -281.8716    | -11.8493        | -12.9618      |
-| 0.0           | 16.33 | 4000 | 0.0000          | 22.2209        | -9.2802          | 1.0                | 31.5011         | -89.5998       | -281.8713    | -11.8496        | -12.9621      |
 ### Framework versions

 This model is a fine-tuned version of [ahmedabdelwahed/Mojiz-sft](https://huggingface.co/ahmedabdelwahed/Mojiz-sft) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0000
+- Rewards/chosen: 16.5213
+- Rewards/rejected: -7.9440
 - Rewards/accuracies: 1.0
+- Rewards/margins: 24.4653
+- Logps/rejected: -86.9274
+- Logps/chosen: -293.2704
+- Logits/rejected: -11.2915
+- Logits/chosen: -12.2516
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0001
 - train_batch_size: 4
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 150
+- training_steps: 1000
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.0017        | 0.41  | 100  | 0.0000          | 9.9359         | -3.7597          | 1.0                | 13.6956         | -78.5589       | -306.4413    | -11.4127        | -12.4541      |
+| 0.0002        | 0.82  | 200  | 0.0000          | 14.1969        | -5.8004          | 1.0                | 19.9973         | -82.6403       | -297.9192    | -11.3000        | -12.2682      |
+| 0.0037        | 1.22  | 300  | 0.0000          | 14.8615        | -6.7633          | 1.0                | 21.6248         | -84.5661       | -296.5901    | -11.2673        | -12.2269      |
+| 0.0           | 1.63  | 400  | 0.0000          | 15.4935        | -7.6471          | 1.0                | 23.1406         | -86.3337       | -295.3261    | -11.2271        | -12.1591      |
+| 0.0           | 2.04  | 500  | 0.0000          | 15.8634        | -7.8871          | 1.0                | 23.7505         | -86.8136       | -294.5863    | -11.2316        | -12.1672      |
+| 0.0           | 2.45  | 600  | 0.0000          | 16.1624        | -7.8756          | 1.0                | 24.0380         | -86.7906       | -293.9882    | -11.2578        | -12.2052      |
+| 0.0           | 2.86  | 700  | 0.0000          | 16.1247        | -8.2229          | 1.0                | 24.3476         | -87.4853       | -294.0637    | -11.2414        | -12.1705      |
+| 0.0           | 3.27  | 800  | 0.0000          | 16.4219        | -7.9771          | 1.0                | 24.3989         | -86.9936       | -293.4693    | -11.2814        | -12.2344      |
+| 0.0           | 3.67  | 900  | 0.0000          | 16.4248        | -7.9873          | 1.0                | 24.4122         | -87.0141       | -293.4634    | -11.2812        | -12.2342      |
+| 0.0           | 4.08  | 1000 | 0.0000          | 16.5213        | -7.9440          | 1.0                | 24.4653         | -86.9274       | -293.2704    | -11.2915        | -12.2516      |
 ### Framework versions
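
Note on the metrics in this hunk: in both the removed and the added evaluation summaries, Rewards/margins matches Rewards/chosen minus Rewards/rejected. The sketch below recomputes the margin from values copied from the diff. The helper name `reward_margin` and the tolerance are illustrative assumptions, not part of any library.

```python
def reward_margin(chosen: float, rejected: float) -> float:
    """Margin between the chosen and rejected rewards, rounded to the 4 decimals the card reports."""
    return round(chosen - rejected, 4)


# Final evaluation values copied from the diff (removed side, then added side).
reported = {
    "old (4000 steps)": {"chosen": 22.2209, "rejected": -9.2802, "margin": 31.5011},
    "new (1000 steps)": {"chosen": 16.5213, "rejected": -7.9440, "margin": 24.4653},
}

for label, row in reported.items():
    computed = reward_margin(row["chosen"], row["rejected"])
    # Allow a small tolerance because the card rounds each column independently.
    consistent = abs(computed - row["margin"]) < 5e-4
    print(f"{label}: computed={computed:.4f} reported={row['margin']:.4f} consistent={consistent}")
```

The shorter run ends with a smaller margin (24.4653 versus 31.5011), while Rewards/accuracies stays at 1.0 in both.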

adapter_config.json CHANGED

@@ -19,8 +19,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q",
-    "v"
   ],
   "task_type": "SEQ_2_SEQ_LM"
 }

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "v",
+    "q"
   ],
   "task_type": "SEQ_2_SEQ_LM"
 }
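
Note on this hunk: the only visible change is the order of the two `target_modules` entries. A minimal sketch, assuming the order of this list carries no meaning (an assumption; the diff itself does not say so), compares the two versions as sets. The JSON fragments below contain only the fields visible in the hunk, and `same_target_modules` is a hypothetical helper.

```python
import json

# Only the fields visible in the hunk; the real file has more keys.
old_fragment = json.loads('{"target_modules": ["q", "v"], "task_type": "SEQ_2_SEQ_LM"}')
new_fragment = json.loads('{"target_modules": ["v", "q"], "task_type": "SEQ_2_SEQ_LM"}')


def same_target_modules(a: dict, b: dict) -> bool:
    """True if both configs name the same modules, ignoring list order."""
    return set(a["target_modules"]) == set(b["target_modules"])


print("lists identical:", old_fragment["target_modules"] == new_fragment["target_modules"])
print("same modules:", same_target_modules(old_fragment, new_fragment))
```

Under that assumption the adapter still targets the same two modules, `q` and `v`, before and after the commit.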

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f513e6fc34d162dd916e0e7609a14c3ac1e5eb0cf18cfddee2ce1f8552584593
 size 7098016

 version https://git-lfs.github.com/spec/v1
+oid sha256:79535a0e2a032c784c61d90fcbb50b82a79fd7b9df89d4272ab43984c2421865
 size 7098016

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:efa77791e82ab6c2feaf4e534134b8b0528decaaa81cac2afa283f43e4fda446
 size 4219

 version https://git-lfs.github.com/spec/v1
+oid sha256:304e26e2abb03ba7eaa0fadfbe1f573864d1c879025abe293c6ffe715100b200
 size 4219
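
Note on the two binary files: the diff shows only their Git LFS pointer files, each made of `version`, `oid`, and `size` lines. The sketch below parses the two `adapter_model.safetensors` pointers copied from above and reports which fields changed. `parse_lfs_pointer` and `changed_fields` are hypothetical helpers written for this note, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def changed_fields(old: dict, new: dict) -> list:
    """Names of the fields whose values differ between two pointers."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:f513e6fc34d162dd916e0e7609a14c3ac1e5eb0cf18cfddee2ce1f8552584593
size 7098016
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:79535a0e2a032c784c61d90fcbb50b82a79fd7b9df89d4272ab43984c2421865
size 7098016
""")

print("changed:", changed_fields(old_pointer, new_pointer))
print("size in bytes:", int(new_pointer["size"]))
```

Only the `oid` differs: the stored file has new contents but the same byte size, which is what one would expect when weights of the same shape are retrained.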