tsavage68 committed on
Commit 921670c · verified · 1 Parent(s): 201e2c5

End of training

README.md CHANGED
@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.4935
- - Rewards/chosen: -6.2215
- - Rewards/rejected: -5.6448
- - Rewards/accuracies: 0.3626
- - Rewards/margins: -0.5767
- - Logps/rejected: -85.0207
- - Logps/chosen: -85.6008
- - Logits/rejected: -5.8605
- - Logits/chosen: -5.8604
 
  ## Model description
 
@@ -44,7 +44,7 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 1e-05
  - train_batch_size: 4
  - eval_batch_size: 1
  - seed: 42
@@ -59,26 +59,26 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 1.1357 | 0.1 | 50 | 1.1734 | -1.9602 | -1.6509 | 0.3582 | -0.3094 | -45.0812 | -42.9883 | -3.1053 | -3.1052 |
- | 1.7275 | 0.2 | 100 | 1.5539 | -4.8260 | -4.4502 | 0.3978 | -0.3758 | -73.0739 | -71.6456 | -2.7839 | -2.7839 |
- | 1.6716 | 0.29 | 150 | 1.4805 | -4.2682 | -3.8441 | 0.3890 | -0.4241 | -67.0136 | -66.0676 | -3.8634 | -3.8634 |
- | 1.9883 | 0.39 | 200 | 1.4624 | -4.1549 | -3.7121 | 0.3648 | -0.4429 | -65.6932 | -64.9352 | -4.6023 | -4.6023 |
- | 1.2968 | 0.49 | 250 | 1.4720 | -4.1636 | -3.7323 | 0.3802 | -0.4312 | -65.8957 | -65.0215 | -4.0699 | -4.0699 |
- | 1.5145 | 0.59 | 300 | 1.4656 | -4.1401 | -3.6836 | 0.3626 | -0.4564 | -65.4088 | -64.7864 | -4.8231 | -4.8231 |
- | 1.7123 | 0.68 | 350 | 1.4617 | -4.1237 | -3.6671 | 0.3670 | -0.4567 | -65.2432 | -64.6233 | -4.7696 | -4.7696 |
- | 1.295 | 0.78 | 400 | 1.4632 | -4.1764 | -3.7222 | 0.3714 | -0.4543 | -65.7941 | -65.1502 | -4.9799 | -4.9799 |
- | 1.405 | 0.88 | 450 | 1.4666 | -4.1922 | -3.7464 | 0.3714 | -0.4458 | -66.0363 | -65.3076 | -5.0856 | -5.0856 |
- | 1.9129 | 0.98 | 500 | 1.4701 | -4.2370 | -3.7742 | 0.3648 | -0.4628 | -66.3146 | -65.7560 | -5.1195 | -5.1195 |
- | 1.2959 | 1.07 | 550 | 1.4889 | -4.3597 | -3.8796 | 0.3692 | -0.4802 | -67.3681 | -66.9833 | -5.1899 | -5.1899 |
- | 1.2707 | 1.17 | 600 | 1.5193 | -4.6364 | -4.1231 | 0.3582 | -0.5133 | -69.8035 | -69.7498 | -5.9136 | -5.9136 |
- | 1.3242 | 1.27 | 650 | 1.5168 | -4.6159 | -4.1101 | 0.3538 | -0.5057 | -69.6739 | -69.5444 | -5.3603 | -5.3603 |
- | 1.397 | 1.37 | 700 | 2.1272 | -6.5216 | -6.2977 | 0.4022 | -0.2239 | -91.5493 | -88.6020 | -3.4923 | -3.4922 |
- | 1.3107 | 1.46 | 750 | 1.4798 | -4.5654 | -4.0673 | 0.3626 | -0.4981 | -69.2450 | -69.0399 | -5.4624 | -5.4624 |
- | 1.2491 | 1.56 | 800 | 1.4610 | -4.8769 | -4.3575 | 0.3648 | -0.5193 | -72.1476 | -72.1544 | -5.2893 | -5.2893 |
- | 1.3924 | 1.66 | 850 | 1.4805 | -5.8437 | -5.2709 | 0.3473 | -0.5728 | -81.2817 | -81.8233 | -5.6057 | -5.6058 |
- | 1.1725 | 1.76 | 900 | 1.4957 | -6.2498 | -5.6711 | 0.3626 | -0.5787 | -85.2834 | -85.8838 | -5.8532 | -5.8531 |
- | 1.2113 | 1.86 | 950 | 1.4937 | -6.2249 | -5.6485 | 0.3626 | -0.5763 | -85.0578 | -85.6343 | -5.8631 | -5.8630 |
- | 1.5057 | 1.95 | 1000 | 1.4935 | -6.2215 | -5.6448 | 0.3626 | -0.5767 | -85.0207 | -85.6008 | -5.8605 | -5.8604 |
 
 
  ### Framework versions
 
 
  This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) on an unknown dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 0.9939
+ - Rewards/chosen: -3.9532
+ - Rewards/rejected: -5.6547
+ - Rewards/accuracies: 0.6000
+ - Rewards/margins: 1.7015
+ - Logps/rejected: -85.1197
+ - Logps/chosen: -62.9180
+ - Logits/rejected: -2.0229
+ - Logits/chosen: -2.0243
 
  ## Model description
 
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
+ - learning_rate: 1e-06
  - train_batch_size: 4
  - eval_batch_size: 1
  - seed: 42
 
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.6073 | 0.1 | 50 | 0.6623 | -1.2716 | -1.5743 | 0.5736 | 0.3026 | -44.3150 | -36.1020 | -2.8014 | -2.8019 |
+ | 0.7223 | 0.2 | 100 | 0.7934 | -3.0203 | -3.2538 | 0.5077 | 0.2336 | -61.1108 | -53.5883 | -2.4237 | -2.4243 |
+ | 0.8563 | 0.29 | 150 | 0.7580 | -1.8675 | -2.3470 | 0.5604 | 0.4795 | -52.0427 | -42.0607 | -2.5521 | -2.5529 |
+ | 0.7701 | 0.39 | 200 | 0.7631 | -1.8702 | -2.1583 | 0.5231 | 0.2882 | -50.1556 | -42.0875 | -2.7052 | -2.7056 |
+ | 0.8749 | 0.49 | 250 | 0.7941 | -2.4787 | -2.6066 | 0.4879 | 0.1279 | -54.6385 | -48.1731 | -2.8184 | -2.8189 |
+ | 0.6954 | 0.59 | 300 | 0.8039 | -1.5721 | -1.9872 | 0.5473 | 0.4151 | -48.4439 | -39.1064 | -2.8263 | -2.8268 |
+ | 0.733 | 0.68 | 350 | 0.7751 | -0.5753 | -1.0891 | 0.5253 | 0.5138 | -39.4632 | -29.1387 | -2.7587 | -2.7591 |
+ | 0.8256 | 0.78 | 400 | 0.7376 | -1.2950 | -1.7911 | 0.5516 | 0.4962 | -46.4838 | -36.3354 | -2.9702 | -2.9707 |
+ | 0.6485 | 0.88 | 450 | 0.7344 | -1.7798 | -2.3960 | 0.5692 | 0.6162 | -52.5322 | -41.1838 | -2.7167 | -2.7174 |
+ | 0.612 | 0.98 | 500 | 0.7051 | -1.3500 | -2.0968 | 0.5978 | 0.7467 | -49.5400 | -36.8863 | -2.5131 | -2.5138 |
+ | 0.2108 | 1.07 | 550 | 0.7799 | -2.0131 | -3.4580 | 0.6418 | 1.4449 | -63.1524 | -43.5171 | -2.2469 | -2.2482 |
+ | 0.1378 | 1.17 | 600 | 0.9314 | -3.4717 | -5.1214 | 0.6198 | 1.6497 | -79.7863 | -58.1027 | -1.9917 | -1.9933 |
+ | 0.188 | 1.27 | 650 | 0.9857 | -3.6647 | -5.3449 | 0.6198 | 1.6803 | -82.0219 | -60.0328 | -1.9585 | -1.9601 |
+ | 0.3739 | 1.37 | 700 | 1.0046 | -3.6506 | -5.3352 | 0.6176 | 1.6846 | -81.9245 | -59.8915 | -2.0334 | -2.0349 |
+ | 0.0428 | 1.46 | 750 | 0.9881 | -3.8094 | -5.4955 | 0.6088 | 1.6861 | -83.5278 | -61.4803 | -2.0272 | -2.0287 |
+ | 0.131 | 1.56 | 800 | 0.9900 | -3.9653 | -5.6306 | 0.6022 | 1.6653 | -84.8782 | -63.0390 | -2.0228 | -2.0242 |
+ | 0.1558 | 1.66 | 850 | 0.9943 | -3.9735 | -5.6628 | 0.6000 | 1.6893 | -85.2000 | -63.1207 | -2.0177 | -2.0191 |
+ | 0.1876 | 1.76 | 900 | 0.9939 | -3.9576 | -5.6566 | 0.6000 | 1.6989 | -85.1381 | -62.9622 | -2.0227 | -2.0241 |
+ | 0.1415 | 1.86 | 950 | 0.9945 | -3.9552 | -5.6536 | 0.6022 | 1.6984 | -85.1084 | -62.9377 | -2.0232 | -2.0246 |
+ | 0.1163 | 1.95 | 1000 | 0.9939 | -3.9532 | -5.6547 | 0.6000 | 1.7015 | -85.1197 | -62.9180 | -2.0229 | -2.0243 |
 
 
  ### Framework versions
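
The updated results table can be sanity-checked with a few lines of Python. The sketch below is illustrative only and is not part of the commit: it copies the header and four rows (steps 50, 500, 550 and 1000) from the new README, parses them, and checks the assumption that `Rewards/margins` is reported as `Rewards/chosen` minus `Rewards/rejected`, which the figures are consistent with up to rounding. The helper names `parse_table` and `margin_gap` are hypothetical.

```python
# Header, alignment row and four data rows copied from the updated results
# table (steps 50, 500, 550 and 1000); the remaining rows are omitted here.
TABLE = """
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6073 | 0.1 | 50 | 0.6623 | -1.2716 | -1.5743 | 0.5736 | 0.3026 | -44.3150 | -36.1020 | -2.8014 | -2.8019 |
| 0.612 | 0.98 | 500 | 0.7051 | -1.3500 | -2.0968 | 0.5978 | 0.7467 | -49.5400 | -36.8863 | -2.5131 | -2.5138 |
| 0.2108 | 1.07 | 550 | 0.7799 | -2.0131 | -3.4580 | 0.6418 | 1.4449 | -63.1524 | -43.5171 | -2.2469 | -2.2482 |
| 0.1163 | 1.95 | 1000 | 0.9939 | -3.9532 | -5.6547 | 0.6000 | 1.7015 | -85.1197 | -62.9180 | -2.0229 | -2.0243 |
"""


def parse_table(text):
    """Turn a Markdown table into a list of {column name: float} dicts."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[1:]:
        cells = [c.strip() for c in ln.strip("|").split("|")]
        if all(set(c) <= set(":-") for c in cells):
            continue  # skip the alignment row (|:---:|...)
        rows.append(dict(zip(header, (float(c) for c in cells))))
    return rows


def margin_gap(row):
    """Absolute difference between the reported margin and chosen - rejected.

    Assumption: the margin column is the chosen reward minus the rejected one.
    """
    expected = row["Rewards/chosen"] - row["Rewards/rejected"]
    return abs(row["Rewards/margins"] - expected)


rows = parse_table(TABLE)
lowest_loss = min(rows, key=lambda r: r["Validation Loss"])
largest_margin = max(rows, key=lambda r: r["Rewards/margins"])

for r in rows:
    print(int(r["Step"]), r["Validation Loss"], r["Rewards/margins"],
          f"gap={margin_gap(r):.4f}")
print("lowest validation loss at step", int(lowest_loss["Step"]))
print("largest reward margin at step", int(largest_margin["Step"]))
```

In these rows the validation loss is lowest at step 50 while the reward margin is largest at step 1000, so which checkpoint counts as "best" depends on the metric chosen.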
final_checkpoint/model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c3cd7dc04dafcf29fcdd207ebf365d9d2f63905aeae16637207ec2fc27a474ae
  size 4943162240
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:f71658322a22f7993778aadced1698aeca9f909bb9e62c62d0bc02eb274efbae
  size 4943162240
final_checkpoint/model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:dbc50bba8998d2145e1116345b401a8bc47a83a1730180fdde3162d5613d264c
  size 4999819232
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:7fde66ed0238c834bd135141ecdaa4594b76a52d6c6871c3ec65437c1de63e1f
  size 4999819232
final_checkpoint/model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:176470d6f63c17f2158ceb116f90b0ef8f773a26813bede1b0c080f4558c562d
  size 4540516256
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:93f6894c0462f44b887be167ea2ad6f52a2b9ab82a7597ef40dad4dc4edd66f9
  size 4540516256
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c3cd7dc04dafcf29fcdd207ebf365d9d2f63905aeae16637207ec2fc27a474ae
  size 4943162240
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:f71658322a22f7993778aadced1698aeca9f909bb9e62c62d0bc02eb274efbae
  size 4943162240
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:dbc50bba8998d2145e1116345b401a8bc47a83a1730180fdde3162d5613d264c
  size 4999819232
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:7fde66ed0238c834bd135141ecdaa4594b76a52d6c6871c3ec65437c1de63e1f
  size 4999819232
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:176470d6f63c17f2158ceb116f90b0ef8f773a26813bede1b0c080f4558c562d
  size 4540516256
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:93f6894c0462f44b887be167ea2ad6f52a2b9ab82a7597ef40dad4dc4edd66f9
  size 4540516256
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d89a6822d3c795aef994f45b3556d00869d4c0fb44cfc4557e0c6d3ad877c1b7
  size 4475
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:2ff767c737ee9c42c29eccc3e10c3d48025113e4d40542f97a8e43bcc3f8836d
  size 4475
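
Each binary file in this commit is stored as a Git LFS pointer: a `version` line, an `oid` line carrying a SHA-256 digest, and a `size` line in bytes. As a minimal sketch (not part of the commit, with hypothetical helper names `parse_lfs_pointer` and `matches_pointer`), a downloaded file could be checked against its pointer like this, using only the Python standard library:

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the "key value" lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,        # "sha256" in every pointer shown in this commit
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the size and digest recorded in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Read in chunks so multi-gigabyte shards do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2ff767c737ee9c42c29eccc3e10c3d48025113e4d40542f97a8e43bcc3f8836d
size 4475
"""
pointer = parse_lfs_pointer(POINTER)

# Sizes of the three safetensors shards, copied from the pointers above.
SHARD_SIZES = [4943162240, 4999819232, 4540516256]
total_bytes = sum(SHARD_SIZES)
print(pointer["algo"], pointer["size"], total_bytes)
```

A call such as `matches_pointer("training_args.bin", pointer)` assumes the file has already been downloaded to that local path. Every pointer in this commit keeps its `size` while its `oid` changes, which is consistent with the files having the same layout but different contents.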