RikkiXu committed on
Commit
a11c741
1 Parent(s): 8c001bc

Model save

README.md CHANGED
@@ -15,15 +15,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model was trained from scratch on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.8979
- - Rewards/chosen: -6.9869
- - Rewards/rejected: -8.4701
- - Rewards/accuracies: 0.6094
- - Rewards/margins: 1.4832
- - Logps/rejected: -1164.5387
- - Logps/chosen: -1010.4669
- - Logits/rejected: -0.5643
- - Logits/chosen: -0.7199
 
  ## Model description
 
@@ -58,17 +58,17 @@ The following hyperparameters were used during training:
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
- |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.2555 | 0.1 | 100 | 1.4172 | -4.8884 | -5.6701 | 0.5898 | 0.7817 | -884.5335 | -800.6121 | -1.3358 | -1.3942 |
- | 0.1854 | 0.21 | 200 | 1.6754 | -6.1508 | -7.3259 | 0.6211 | 1.1752 | -1050.1200 | -926.8517 | -1.1088 | -1.1853 |
- | 0.1799 | 0.31 | 300 | 1.5590 | -5.9157 | -6.9794 | 0.5977 | 1.0637 | -1015.4615 | -903.3419 | -1.0193 | -1.1110 |
- | 0.1679 | 0.42 | 400 | 2.1030 | -7.8503 | -9.2060 | 0.6094 | 1.3557 | -1238.1252 | -1096.8108 | -0.5753 | -0.7096 |
- | 0.1693 | 0.52 | 500 | 1.6563 | -6.3408 | -7.6718 | 0.625 | 1.3310 | -1084.7078 | -945.8611 | -0.8598 | -0.9873 |
- | 0.1609 | 0.63 | 600 | 1.6818 | -6.4795 | -7.7992 | 0.6211 | 1.3198 | -1097.4480 | -959.7227 | -0.4515 | -0.6164 |
- | 0.1559 | 0.73 | 700 | 1.9278 | -7.3485 | -8.7955 | 0.6133 | 1.4470 | -1197.0731 | -1046.6217 | -0.4166 | -0.5852 |
- | 0.1433 | 0.84 | 800 | 1.9050 | -7.1496 | -8.6252 | 0.6172 | 1.4756 | -1180.0403 | -1026.7318 | -0.5141 | -0.6745 |
- | 0.1479 | 0.94 | 900 | 1.8979 | -6.9869 | -8.4701 | 0.6094 | 1.4832 | -1164.5387 | -1010.4669 | -0.5643 | -0.7199 |
 
 
  ### Framework versions
 
 
  This model was trained from scratch on the None dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 1.3947
+ - Rewards/chosen: -2.4314
+ - Rewards/rejected: -2.0023
+ - Rewards/accuracies: 0.3867
+ - Rewards/margins: -0.4292
+ - Logps/rejected: -517.7516
+ - Logps/chosen: -554.9180
+ - Logits/rejected: -1.0823
+ - Logits/chosen: -1.1239
 
  ## Model description
 
 
 
  ### Training results
 
+ | Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
+ |:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
+ | 0.3047 | 0.1 | 100 | -2.4405 | -2.3863 | -361.0801 | -337.7748 | 0.8551 | 0.3203 | -0.4930 | -0.2905 | -0.2025 |
+ | 0.1861 | 0.21 | 200 | -1.5418 | -1.5107 | -450.2716 | -421.0934 | 1.0495 | 0.3867 | -1.3850 | -0.3493 | -1.0357 |
+ | 0.1608 | 0.31 | 300 | -1.4367 | -1.4022 | -454.9446 | -422.9684 | 1.0910 | 0.3945 | -1.4317 | -0.3772 | -1.0544 |
+ | 0.1368 | 0.42 | 400 | -1.0538 | -1.0131 | -520.1699 | -479.6456 | 1.3010 | 0.4102 | -2.0839 | -0.4627 | -1.6212 |
+ | 0.1364 | 0.52 | 500 | -1.6466 | -1.6090 | -470.0934 | -430.8614 | 1.1773 | 0.3711 | -1.5832 | -0.4498 | -1.1334 |
+ | 0.1223 | 0.63 | 600 | -1.1880 | -1.1541 | -541.4883 | -500.4930 | 1.3206 | 0.4141 | -2.2971 | -0.4674 | -1.8297 |
+ | 0.0971 | 0.73 | 700 | -0.9712 | -0.9392 | -577.3128 | -533.4667 | 1.4638 | 0.3906 | -2.6554 | -0.4959 | -2.1594 |
+ | 0.1035 | 0.84 | 800 | -0.9232 | -0.8902 | -569.3817 | -532.9068 | 1.4475 | 0.3945 | -2.5761 | -0.4222 | -2.1538 |
+ | 0.088 | 0.94 | 900 | -1.1239 | -1.0823 | -554.9180 | -517.7516 | 1.3947 | 0.3867 | -2.4314 | -0.4292 | -2.0023 |
 
 
  ### Framework versions
all_results.json CHANGED
@@ -1,8 +1,8 @@
  {
  "epoch": 1.0,
- "train_loss": 0.1961859940234279,
- "train_runtime": 15468.9338,
- "train_samples": 122270,
- "train_samples_per_second": 7.904,
- "train_steps_per_second": 0.062
  }
 
  {
  "epoch": 1.0,
+ "train_loss": 0.05187356284775659,
+ "train_runtime": 7314.4586,
+ "train_samples": 122268,
+ "train_samples_per_second": 16.716,
+ "train_steps_per_second": 0.131
  }
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:995180ab1bfccd03f3ed43211962300537543bf349fa09701bdfed2446682de0
  size 4943178720
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:938e0d524d2c5b738205f45cbd32d084ef09b48fc55b42d2357ec2767acdf6c6
  size 4943178720
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1bcfc7b87b51e023f9120c05ca16d79dc2363a9b1014b9e836f1d230e2035fc3
  size 4999819336
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:1ec78470671379c923d72d3be323be44ae390376154e5d9fd23e55a6f9be5d44
  size 4999819336
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:93c22b80739cca4d51c122983391f1fc2fe409c9718974e8823aff56b6815dfe
  size 4540532728
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:e68bdcd164f3a46f36753fb22642732b303167d2c92a5668db1be3f302bf4b44
  size 4540532728
runs/May10_12-21-48_n136-098-158/events.out.tfevents.1715315002.n136-098-158.1854935.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3194baf3f7b4da40a068d9a6ddf6c42210fe6a0907a962460e49f8c924dac98e
- size 35395
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:12c7b7fa838a8b64c129bfdfbe99a6f5a620e4b04254e73d65b5d5412d77b2d2
+ size 39189
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
  "epoch": 1.0,
- "train_loss": 0.1961859940234279,
- "train_runtime": 15468.9338,
- "train_samples": 122270,
- "train_samples_per_second": 7.904,
- "train_steps_per_second": 0.062
  }
 
  {
  "epoch": 1.0,
+ "train_loss": 0.05187356284775659,
+ "train_runtime": 7314.4586,
+ "train_samples": 122268,
+ "train_samples_per_second": 16.716,
+ "train_steps_per_second": 0.131
  }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:291902e71a2a4585a72bed7e38f885273985e04787086b5e9c772897150379c6
  size 6264
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:5ee00489c1463fae472326de1af05f14960d5729e1f2cab6ea21d656cab52f1b
  size 6264