RikkiXu committed on
Commit
d3fea2d
1 Parent(s): 28d9218

Model save

README.md CHANGED
@@ -15,15 +15,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model was trained from scratch on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.0232
- - Rewards/chosen: -7.1946
- - Rewards/rejected: -8.7238
- - Rewards/accuracies: 0.6133
- - Rewards/margins: 1.5292
- - Logps/rejected: -1160.1461
- - Logps/chosen: -1001.0963
- - Logits/rejected: -0.4190
- - Logits/chosen: -0.5892
 
  ## Model description
 
@@ -58,17 +58,17 @@ The following hyperparameters were used during training:
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
- |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.254 | 0.1 | 100 | 1.4761 | -5.3290 | -6.2112 | 0.5898 | 0.8822 | -908.8818 | -814.5385 | -1.4873 | -1.5203 |
- | 0.1844 | 0.21 | 200 | 1.7253 | -6.1555 | -7.4481 | 0.6133 | 1.2926 | -1032.5726 | -897.1824 | -1.4103 | -1.4664 |
- | 0.1635 | 0.31 | 300 | 1.6677 | -6.1768 | -7.3921 | 0.5938 | 1.2153 | -1026.9750 | -899.3143 | -0.6257 | -0.7515 |
- | 0.1606 | 0.42 | 400 | 2.0307 | -7.0774 | -8.4601 | 0.6016 | 1.3827 | -1133.7700 | -989.3718 | -0.4798 | -0.6143 |
- | 0.163 | 0.52 | 500 | 1.8216 | -6.5495 | -8.0368 | 0.5898 | 1.4873 | -1091.4379 | -936.5793 | -0.8136 | -0.9380 |
- | 0.1656 | 0.63 | 600 | 1.8091 | -6.5309 | -7.9593 | 0.625 | 1.4284 | -1083.6920 | -934.7285 | -0.3700 | -0.5360 |
- | 0.1552 | 0.73 | 700 | 2.0767 | -7.6318 | -9.1866 | 0.5977 | 1.5547 | -1206.4197 | -1044.8179 | -0.3588 | -0.5351 |
- | 0.1377 | 0.84 | 800 | 2.0307 | -7.2870 | -8.8043 | 0.6055 | 1.5173 | -1168.1901 | -1010.3356 | -0.4156 | -0.5887 |
- | 0.1462 | 0.94 | 900 | 2.0232 | -7.1946 | -8.7238 | 0.6133 | 1.5292 | -1160.1461 | -1001.0963 | -0.4190 | -0.5892 |
 
 
  ### Framework versions
 
 
  This model was trained from scratch on the None dataset.
  It achieves the following results on the evaluation set:
+ - Logits/chosen: -0.7535
+ - Logits/rejected: -0.5762
+ - Logps/chosen: -1012.8752
+ - Logps/rejected: -1186.8893
+ - Loss: 2.0172
+ - Rewards/accuracies: 0.6211
+ - Rewards/chosen: -7.3124
+ - Rewards/margins: 1.6789
+ - Rewards/rejected: -8.9913
 
  ## Model description
 
 
 
  ### Training results
 
+ | Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
+ |:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
+ | 0.2553 | 0.1 | 100 | -1.4311 | -1.3999 | -810.3989 | -903.4209 | 1.4885 | 0.5938 | -5.2877 | 0.8689 | -6.1566 |
+ | 0.1835 | 0.21 | 200 | -1.1732 | -1.0838 | -985.7894 | -1125.4023 | 1.7331 | 0.6406 | -7.0416 | 1.3349 | -8.3764 |
+ | 0.1686 | 0.31 | 300 | -0.4382 | -0.2631 | -962.0720 | -1124.1737 | 1.8406 | 0.6211 | -6.8044 | 1.5597 | -8.3641 |
+ | 0.1652 | 0.42 | 400 | -0.9962 | -0.8431 | -1031.8544 | -1193.3768 | 2.2100 | 0.6133 | -7.5022 | 1.5539 | -9.0562 |
+ | 0.1641 | 0.52 | 500 | -0.6576 | -0.4578 | -968.2329 | -1137.7808 | 1.7548 | 0.6211 | -6.8660 | 1.6342 | -8.5002 |
+ | 0.1644 | 0.63 | 600 | -0.6600 | -0.4861 | -931.4107 | -1085.6713 | 1.7738 | 0.6328 | -6.4978 | 1.4813 | -7.9791 |
+ | 0.1568 | 0.73 | 700 | -0.5415 | -0.3529 | -1035.0319 | -1212.3086 | 2.0363 | 0.6367 | -7.5340 | 1.7115 | -9.2455 |
+ | 0.1394 | 0.84 | 800 | -0.6702 | -0.4872 | -1030.3461 | -1205.7072 | 2.0376 | 0.6289 | -7.4871 | 1.6923 | -9.1795 |
+ | 0.1421 | 0.94 | 900 | -0.7535 | -0.5762 | -1012.8752 | -1186.8893 | 2.0172 | 0.6211 | -7.3124 | 1.6789 | -8.9913 |
 
 
  ### Framework versions
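
The README does not define its metric columns. Assuming `Rewards/margins` is simply `Rewards/chosen` minus `Rewards/rejected` (an assumption, not something this commit states), the added rows can be sanity-checked with the sketch below. The helper name `margin_matches` and the tolerance are illustrative; the numbers are copied from the new table.

```python
# Rows copied from the "+" side of the training-results table:
# (step, rewards_chosen, rewards_rejected, rewards_margins)
rows = [
    (100, -5.2877, -6.1566, 0.8689),
    (200, -7.0416, -8.3764, 1.3349),
    (500, -6.8660, -8.5002, 1.6342),
    (900, -7.3124, -8.9913, 1.6789),
]


def margin_matches(chosen: float, rejected: float, margin: float,
                   tol: float = 5e-4) -> bool:
    """Return True if margin equals chosen - rejected within rounding error.

    The tolerance is a hypothetical choice that allows for the table
    values having been rounded to four decimals independently.
    """
    return abs((chosen - rejected) - margin) <= tol


# Assumed relationship: margins = chosen - rejected.
all_match = all(margin_matches(c, r, m) for _, c, r, m in rows)
print("margins consistent with chosen - rejected:", all_match)
```

Under that assumption the rows agree to within the last printed digit, which suggests the reordered columns in the new table carry the same quantities as before.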
all_results.json CHANGED
@@ -1,8 +1,8 @@
  {
  "epoch": 1.0,
- "train_loss": 0.194384015168195,
- "train_runtime": 15417.2553,
- "train_samples": 122270,
- "train_samples_per_second": 7.931,
- "train_steps_per_second": 0.062
  }

  {
  "epoch": 1.0,
+ "train_loss": 0.1926751920690087,
+ "train_runtime": 814.3751,
+ "train_samples": 122268,
+ "train_samples_per_second": 150.137,
+ "train_steps_per_second": 1.173
  }
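
As a rough cross-check of the values above, the reported throughput is consistent with `train_samples` divided by `train_runtime` over the single epoch. The sketch below assumes that relationship and that `train_runtime` is in seconds; the function name is illustrative and not part of any library.

```python
def samples_per_second(train_samples: int, train_runtime: float) -> float:
    """Throughput implied by the JSON fields, rounded to three decimals.

    Assumes one epoch and a runtime in seconds; both are assumptions
    read off the values in this commit.
    """
    return round(train_samples / train_runtime, 3)


# Values copied from the removed ("-") and added ("+") sides of the diff.
old_rate = samples_per_second(122270, 15417.2553)
new_rate = samples_per_second(122268, 814.3751)

# Ratio of wall-clock runtimes between the two recorded runs.
runtime_ratio = 15417.2553 / 814.3751

print(old_rate, new_rate, round(runtime_ratio, 1))
```

The recomputed rates match the recorded `train_samples_per_second` fields, and the recorded runtime dropped by roughly a factor of 19. The commit itself does not say what caused the change.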
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7376a50ac65872ba0f050dddd468c17c88ed63f7db168915cfc56727b44a7986
  size 4943178720

  version https://git-lfs.github.com/spec/v1
+ oid sha256:ada5b7290d0922eb776e4ba7c3bac0506cbfc55b90de703c1c1b9078ef15e8c2
  size 4943178720
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bbb7c97be51922fb040f2d98b5ff5071afea302437ca3fb23a3e471d57e69a9a
  size 4999819336

  version https://git-lfs.github.com/spec/v1
+ oid sha256:4a5d5612736e3a036a0950e1836129483c21f374b02b958c3dc37207323c61fe
  size 4999819336
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4f5fe56371c7bc0f27b56b5e0f0eedaa0dedc324888c26a2b74345406bbc0ba6
  size 4540532728

  version https://git-lfs.github.com/spec/v1
+ oid sha256:db7e1d1a0f47b6b4206707315bed80b309dc28ecf5f710a29d9412fd7573e154
  size 4540532728
runs/May11_01-40-52_n136-082-130/events.out.tfevents.1715363605.n136-082-130.679255.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:42ea284f7f1aa301a4333f90089ef4dc7c78a1f7b7236610185897c00daa473e
- size 73954

  version https://git-lfs.github.com/spec/v1
+ oid sha256:202893c44a4a811f9f4be554609293311a386553d0eb28ad7ff4fc94d0e247d3
+ size 77748
runs/May11_06-27-51_n136-082-130/events.out.tfevents.1715380893.n136-082-130.803970.0 ADDED
@@ -0,0 +1,3 @@
 
 
 
 
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fbfd3aa0e2084b4fa44aa34968a622ae8b4106d3bcb1dce04433421501c6be70
+ size 8649
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
  "epoch": 1.0,
- "train_loss": 0.194384015168195,
- "train_runtime": 15417.2553,
- "train_samples": 122270,
- "train_samples_per_second": 7.931,
- "train_steps_per_second": 0.062
  }

  {
  "epoch": 1.0,
+ "train_loss": 0.1926751920690087,
+ "train_runtime": 814.3751,
+ "train_samples": 122268,
+ "train_samples_per_second": 150.137,
+ "train_steps_per_second": 1.173
  }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6ae281a8958994a7bce439d9c20b58e6e6645068e54f28e9d838d5d683846cc2
  size 6200

  version https://git-lfs.github.com/spec/v1
+ oid sha256:ed2056e9b8414b41a9fff131f35b4438e77e1609ffe55e8570c5c650ab634922
  size 6200