End of training
- README.md +30 -30
- final_checkpoint/model-00001-of-00003.safetensors +1 -1
- final_checkpoint/model-00002-of-00003.safetensors +1 -1
- final_checkpoint/model-00003-of-00003.safetensors +1 -1
- model-00001-of-00003.safetensors +1 -1
- model-00002-of-00003.safetensors +1 -1
- model-00003-of-00003.safetensors +1 -1
- training_args.bin +1 -1
README.md
CHANGED
@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss:
-- Rewards/chosen: -
-- Rewards/rejected: -5.
-- Rewards/accuracies: 0.
-- Rewards/margins:
-- Logps/rejected: -85.
-- Logps/chosen: -
-- Logits/rejected: -
-- Logits/chosen: -
+- Loss: 0.9939
+- Rewards/chosen: -3.9532
+- Rewards/rejected: -5.6547
+- Rewards/accuracies: 0.6000
+- Rewards/margins: 1.7015
+- Logps/rejected: -85.1197
+- Logps/chosen: -62.9180
+- Logits/rejected: -2.0229
+- Logits/chosen: -2.0243

 ## Model description

@@ -44,7 +44,7 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate: 1e-
+- learning_rate: 1e-06
 - train_batch_size: 4
 - eval_batch_size: 1
 - seed: 42
@@ -59,26 +59,26 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+| 0.6073 | 0.1 | 50 | 0.6623 | -1.2716 | -1.5743 | 0.5736 | 0.3026 | -44.3150 | -36.1020 | -2.8014 | -2.8019 |
+| 0.7223 | 0.2 | 100 | 0.7934 | -3.0203 | -3.2538 | 0.5077 | 0.2336 | -61.1108 | -53.5883 | -2.4237 | -2.4243 |
+| 0.8563 | 0.29 | 150 | 0.7580 | -1.8675 | -2.3470 | 0.5604 | 0.4795 | -52.0427 | -42.0607 | -2.5521 | -2.5529 |
+| 0.7701 | 0.39 | 200 | 0.7631 | -1.8702 | -2.1583 | 0.5231 | 0.2882 | -50.1556 | -42.0875 | -2.7052 | -2.7056 |
+| 0.8749 | 0.49 | 250 | 0.7941 | -2.4787 | -2.6066 | 0.4879 | 0.1279 | -54.6385 | -48.1731 | -2.8184 | -2.8189 |
+| 0.6954 | 0.59 | 300 | 0.8039 | -1.5721 | -1.9872 | 0.5473 | 0.4151 | -48.4439 | -39.1064 | -2.8263 | -2.8268 |
+| 0.733 | 0.68 | 350 | 0.7751 | -0.5753 | -1.0891 | 0.5253 | 0.5138 | -39.4632 | -29.1387 | -2.7587 | -2.7591 |
+| 0.8256 | 0.78 | 400 | 0.7376 | -1.2950 | -1.7911 | 0.5516 | 0.4962 | -46.4838 | -36.3354 | -2.9702 | -2.9707 |
+| 0.6485 | 0.88 | 450 | 0.7344 | -1.7798 | -2.3960 | 0.5692 | 0.6162 | -52.5322 | -41.1838 | -2.7167 | -2.7174 |
+| 0.612 | 0.98 | 500 | 0.7051 | -1.3500 | -2.0968 | 0.5978 | 0.7467 | -49.5400 | -36.8863 | -2.5131 | -2.5138 |
+| 0.2108 | 1.07 | 550 | 0.7799 | -2.0131 | -3.4580 | 0.6418 | 1.4449 | -63.1524 | -43.5171 | -2.2469 | -2.2482 |
+| 0.1378 | 1.17 | 600 | 0.9314 | -3.4717 | -5.1214 | 0.6198 | 1.6497 | -79.7863 | -58.1027 | -1.9917 | -1.9933 |
+| 0.188 | 1.27 | 650 | 0.9857 | -3.6647 | -5.3449 | 0.6198 | 1.6803 | -82.0219 | -60.0328 | -1.9585 | -1.9601 |
+| 0.3739 | 1.37 | 700 | 1.0046 | -3.6506 | -5.3352 | 0.6176 | 1.6846 | -81.9245 | -59.8915 | -2.0334 | -2.0349 |
+| 0.0428 | 1.46 | 750 | 0.9881 | -3.8094 | -5.4955 | 0.6088 | 1.6861 | -83.5278 | -61.4803 | -2.0272 | -2.0287 |
+| 0.131 | 1.56 | 800 | 0.9900 | -3.9653 | -5.6306 | 0.6022 | 1.6653 | -84.8782 | -63.0390 | -2.0228 | -2.0242 |
+| 0.1558 | 1.66 | 850 | 0.9943 | -3.9735 | -5.6628 | 0.6000 | 1.6893 | -85.2000 | -63.1207 | -2.0177 | -2.0191 |
+| 0.1876 | 1.76 | 900 | 0.9939 | -3.9576 | -5.6566 | 0.6000 | 1.6989 | -85.1381 | -62.9622 | -2.0227 | -2.0241 |
+| 0.1415 | 1.86 | 950 | 0.9945 | -3.9552 | -5.6536 | 0.6022 | 1.6984 | -85.1084 | -62.9377 | -2.0232 | -2.0246 |
+| 0.1163 | 1.95 | 1000 | 0.9939 | -3.9532 | -5.6547 | 0.6000 | 1.7015 | -85.1197 | -62.9180 | -2.0229 | -2.0243 |


 ### Framework versions
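The README metric names (Rewards/chosen, Rewards/rejected, Rewards/margins) match those logged by preference-optimization trainers such as TRL's `DPOTrainer`, although this commit does not name the trainer. As a rough sanity check, the sketch below assumes that the reported margin is the chosen reward minus the rejected reward, and it picks out the evaluation steps with the lowest validation loss and the highest reward accuracy. The rows are a hand-copied subset of the table; the function names are hypothetical helpers, not part of any library.

```python
# Selected rows copied from the training results table in README.md:
# (step, training_loss, validation_loss, rewards_chosen, rewards_rejected,
#  rewards_accuracy, rewards_margin)
ROWS = [
    (50, 0.6073, 0.6623, -1.2716, -1.5743, 0.5736, 0.3026),
    (500, 0.612, 0.7051, -1.3500, -2.0968, 0.5978, 0.7467),
    (550, 0.2108, 0.7799, -2.0131, -3.4580, 0.6418, 1.4449),
    (1000, 0.1163, 0.9939, -3.9532, -5.6547, 0.6000, 1.7015),
]


def margin_gap(chosen, rejected, reported):
    """Absolute difference between (chosen - rejected) and the reported margin.

    Assumption: margin = chosen reward - rejected reward, averaged over the
    evaluation set. Small gaps are expected from rounding to four decimals.
    """
    return abs((chosen - rejected) - reported)


def best_by_validation_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def best_by_accuracy(rows):
    """Return the row with the highest reward accuracy."""
    return max(rows, key=lambda row: row[5])


if __name__ == "__main__":
    for step, _, _, chosen, rejected, _, margin in ROWS:
        print(step, "margin gap:", round(margin_gap(chosen, rejected, margin), 4))
    print("lowest validation loss at step", best_by_validation_loss(ROWS)[0])
    print("highest reward accuracy at step", best_by_accuracy(ROWS)[0])
```

In these rows the training loss keeps falling after the first epoch while the validation loss climbs from 0.7051 at step 500 to 0.9939 at step 1000, so the final checkpoint is not the one with the lowest validation loss.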
final_checkpoint/model-00001-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:f71658322a22f7993778aadced1698aeca9f909bb9e62c62d0bc02eb274efbae
 size 4943162240
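Each `.safetensors` entry in this commit is a Git LFS pointer file rather than the weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. The following is a minimal sketch, not an official Git LFS tool, of how such a pointer could be parsed and used to check a downloaded file. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib
import io

# Pointer text as it appears on the new side of the diff for
# final_checkpoint/model-00001-of-00003.safetensors.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f71658322a22f7993778aadced1698aeca9f909bb9e62c62d0bc02eb274efbae
size 4943162240
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line and return version, algorithm, digest, size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(stream, pointer, chunk_size=1 << 20):
    """Hash a binary stream in chunks and compare digest and byte count."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["algo"], pointer["size"])
    # In-memory demonstration; for a real shard, pass open(path, "rb") instead.
    print(matches_pointer(io.BytesIO(b""), pointer))
```

Reading in chunks keeps memory use flat, which matters for shards of roughly 5 GB like the ones listed here.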
final_checkpoint/model-00002-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7fde66ed0238c834bd135141ecdaa4594b76a52d6c6871c3ec65437c1de63e1f
 size 4999819232
final_checkpoint/model-00003-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:93f6894c0462f44b887be167ea2ad6f52a2b9ab82a7597ef40dad4dc4edd66f9
 size 4540516256
model-00001-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:f71658322a22f7993778aadced1698aeca9f909bb9e62c62d0bc02eb274efbae
 size 4943162240
model-00002-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7fde66ed0238c834bd135141ecdaa4594b76a52d6c6871c3ec65437c1de63e1f
 size 4999819232
model-00003-of-00003.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:93f6894c0462f44b887be167ea2ad6f52a2b9ab82a7597ef40dad4dc4edd66f9
 size 4540516256
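The new object IDs of the three root-level shards are identical to those of the matching shards under `final_checkpoint/`, which suggests the two locations hold the same weights. The sketch below, using a hypothetical `POINTERS` mapping filled in by hand from this commit, groups the paths by object ID and sums the size of the unique objects.

```python
from collections import defaultdict

# path -> (sha256 object ID, size in bytes), copied from the pointers in this commit.
POINTERS = {
    "final_checkpoint/model-00001-of-00003.safetensors": (
        "f71658322a22f7993778aadced1698aeca9f909bb9e62c62d0bc02eb274efbae", 4943162240),
    "final_checkpoint/model-00002-of-00003.safetensors": (
        "7fde66ed0238c834bd135141ecdaa4594b76a52d6c6871c3ec65437c1de63e1f", 4999819232),
    "final_checkpoint/model-00003-of-00003.safetensors": (
        "93f6894c0462f44b887be167ea2ad6f52a2b9ab82a7597ef40dad4dc4edd66f9", 4540516256),
    "model-00001-of-00003.safetensors": (
        "f71658322a22f7993778aadced1698aeca9f909bb9e62c62d0bc02eb274efbae", 4943162240),
    "model-00002-of-00003.safetensors": (
        "7fde66ed0238c834bd135141ecdaa4594b76a52d6c6871c3ec65437c1de63e1f", 4999819232),
    "model-00003-of-00003.safetensors": (
        "93f6894c0462f44b887be167ea2ad6f52a2b9ab82a7597ef40dad4dc4edd66f9", 4540516256),
}


def group_by_oid(pointers):
    """Map each object ID to the sorted list of paths that reference it."""
    groups = defaultdict(list)
    for path, (oid, _size) in pointers.items():
        groups[oid].append(path)
    return {oid: sorted(paths) for oid, paths in groups.items()}


def unique_bytes(pointers):
    """Total size of the distinct objects, counting each object ID once."""
    sizes = {oid: size for oid, size in pointers.values()}
    return sum(sizes.values())


if __name__ == "__main__":
    for oid, paths in group_by_oid(POINTERS).items():
        print(oid[:12], paths)
    print("unique bytes:", unique_bytes(POINTERS))
```

Identical object IDs mean identical bytes, so a client that has fetched one copy of a shard can verify the other location without downloading it again.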
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:2ff767c737ee9c42c29eccc3e10c3d48025113e4d40542f97a8e43bcc3f8836d
 size 4475