Commit f0eb78a
Parent: 309e492

Mojiz-DPO

Files changed:
- README.md (+27 -17)
- adapter_model.safetensors (+1 -1)
- training_args.bin (+1 -1)
README.md CHANGED

@@ -17,14 +17,14 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [ahmedabdelwahed/Mojiz-sft](https://huggingface.co/ahmedabdelwahed/Mojiz-sft) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0000
-- Rewards/chosen:
-- Rewards/rejected: -
+- Rewards/chosen: 17.3685
+- Rewards/rejected: -9.0879
 - Rewards/accuracies: 1.0
-- Rewards/margins:
-- Logps/rejected: -
-- Logps/chosen: -
-- Logits/rejected: -11.
-- Logits/chosen: -12.
+- Rewards/margins: 26.4564
+- Logps/rejected: -89.2152
+- Logps/chosen: -291.5760
+- Logits/rejected: -11.3625
+- Logits/chosen: -12.3502
 
 ## Model description
 
@@ -50,22 +50,32 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 150
-- training_steps:
+- training_steps: 2000
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
 | 0.0017 | 0.41 | 100 | 0.0000 | 9.9359 | -3.7597 | 1.0 | 13.6956 | -78.5589 | -306.4413 | -11.4127 | -12.4541 |
-| 0.0002 | 0.82 | 200 | 0.0000 | 14.
-| 0.
-| 0.0 | 1.63 | 400 | 0.0000 | 15.
-| 0.0 | 2.04 | 500 | 0.0000 | 15.
-| 0.0 | 2.45 | 600 | 0.0000 | 16.
-| 0.0 | 2.86 | 700 | 0.0000 | 16.
-| 0.0 | 3.27 | 800 | 0.0000 |
-| 0.0 | 3.67 | 900 | 0.0000 |
-| 0.0 | 4.08 | 1000 | 0.0000 |
+| 0.0002 | 0.82 | 200 | 0.0000 | 14.2220 | -5.8111 | 1.0 | 20.0331 | -82.6616 | -297.8690 | -11.3000 | -12.2683 |
+| 0.0036 | 1.22 | 300 | 0.0000 | 14.9129 | -6.8369 | 1.0 | 21.7499 | -84.7133 | -296.4872 | -11.2647 | -12.2239 |
+| 0.0 | 1.63 | 400 | 0.0000 | 15.5788 | -7.9152 | 1.0 | 23.4939 | -86.8698 | -295.1555 | -11.2175 | -12.1437 |
+| 0.0 | 2.04 | 500 | 0.0000 | 15.9537 | -8.1483 | 1.0 | 24.1020 | -87.3360 | -294.4057 | -11.2256 | -12.1568 |
+| 0.0 | 2.45 | 600 | 0.0000 | 16.3724 | -8.1224 | 1.0 | 24.4948 | -87.2842 | -293.5682 | -11.2611 | -12.2092 |
+| 0.0 | 2.86 | 700 | 0.0000 | 16.2464 | -8.7026 | 1.0 | 24.9490 | -88.4446 | -293.8203 | -11.2351 | -12.1506 |
+| 0.0 | 3.27 | 800 | 0.0000 | 17.0144 | -7.9818 | 1.0 | 24.9962 | -87.0030 | -292.2843 | -11.3359 | -12.3118 |
+| 0.0 | 3.67 | 900 | 0.0000 | 17.0212 | -8.0028 | 1.0 | 25.0241 | -87.0451 | -292.2705 | -11.3356 | -12.3114 |
+| 0.0 | 4.08 | 1000 | 0.0000 | 17.1508 | -8.0673 | 1.0 | 25.2182 | -87.1741 | -292.0114 | -11.3467 | -12.3293 |
+| 0.0 | 4.49 | 1100 | 0.0000 | 17.1035 | -8.6586 | 1.0 | 25.7621 | -88.3567 | -292.1060 | -11.3427 | -12.3196 |
+| 0.0 | 4.9 | 1200 | 0.0000 | 17.1191 | -8.6910 | 1.0 | 25.8101 | -88.4214 | -292.0748 | -11.3428 | -12.3192 |
+| 0.0 | 5.31 | 1300 | 0.0000 | 17.2084 | -8.9388 | 1.0 | 26.1472 | -88.9170 | -291.8962 | -11.3530 | -12.3370 |
+| 0.0 | 5.71 | 1400 | 0.0000 | 17.2160 | -9.1194 | 1.0 | 26.3354 | -89.2783 | -291.8811 | -11.3506 | -12.3312 |
+| 0.0 | 6.12 | 1500 | 0.0000 | 17.2326 | -9.1376 | 1.0 | 26.3702 | -89.3146 | -291.8478 | -11.3494 | -12.3291 |
+| 0.0002 | 6.53 | 1600 | 0.0000 | 17.2830 | -9.1192 | 1.0 | 26.4022 | -89.2778 | -291.7470 | -11.3555 | -12.3384 |
+| 0.0 | 6.94 | 1700 | 0.0000 | 17.3062 | -9.1044 | 1.0 | 26.4107 | -89.2482 | -291.7006 | -11.3582 | -12.3429 |
+| 0.0 | 7.35 | 1800 | 0.0000 | 17.3229 | -9.1143 | 1.0 | 26.4372 | -89.2681 | -291.6673 | -11.3586 | -12.3435 |
+| 0.0 | 7.76 | 1900 | 0.0000 | 17.3411 | -9.1107 | 1.0 | 26.4518 | -89.2609 | -291.6309 | -11.3600 | -12.3457 |
+| 0.0 | 8.16 | 2000 | 0.0000 | 17.3685 | -9.0879 | 1.0 | 26.4564 | -89.2152 | -291.5760 | -11.3625 | -12.3502 |
 
 
 ### Framework versions
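As a quick sanity check on the new evaluation metrics: in DPO-style logging, the reward margin is the chosen reward minus the rejected reward, and the final table row is consistent with that. A minimal sketch using the step-2000 values from the diff above:

```python
# Sanity check: rewards/margins should equal rewards/chosen - rewards/rejected.
# Values are the final (step 2000) evaluation metrics from the table above.
rewards_chosen = 17.3685
rewards_rejected = -9.0879
reported_margin = 26.4564

computed_margin = rewards_chosen - rewards_rejected
assert abs(computed_margin - reported_margin) < 1e-6
print(f"margin = {computed_margin:.4f}")  # matches the reported Rewards/margins
```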
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:cee3e1bfa215df1a3e688744448a3a01bed605da4d393df476ecddf79c902fc5
 size 7098016
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fe74c66fc54071506a581583a53f3e4402f016002b22d95b06e97f597f7273e5
 size 4219
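The two binary files above are stored as Git LFS pointers: the repository holds only the `version`/`oid`/`size` triple, and the commit changes the `oid` (the SHA-256 of the new blob). A small sketch of parsing such a pointer and checking a downloaded file against it (the local file path in the comment is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its oid (hex digest) and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

# Pointer contents for training_args.bin, as committed above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:fe74c66fc54071506a581583a53f3e4402f016002b22d95b06e97f597f7273e5
size 4219
"""
info = parse_lfs_pointer(pointer)
print(info["oid"][:12], info["size"])  # fe74c66fc540 4219

# To verify a downloaded copy (path is hypothetical):
# import hashlib, os
# digest = hashlib.sha256(open("training_args.bin", "rb").read()).hexdigest()
# assert digest == info["oid"] and os.path.getsize("training_args.bin") == info["size"]
```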