Model save

Files changed (9)

README.md +20 -20
all_results.json +5 -5
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1
runs/May10_12-21-48_n136-098-158/events.out.tfevents.1715315002.n136-098-158.1854935.0 +2 -2
train_results.json +5 -5
trainer_state.json +0 -0
training_args.bin +1 -1

README.md CHANGED

@@ -15,15 +15,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.8979
-- Rewards/chosen: -6.9869
-- Rewards/rejected: -8.4701
-- Rewards/accuracies: 0.6094
-- Rewards/margins: 1.4832
-- Logps/rejected: -1164.5387
-- Logps/chosen: -1010.4669
-- Logits/rejected: -0.5643
-- Logits/chosen: -0.7199
 ## Model description
@@ -58,17 +58,17 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
-|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.2555        | 0.1   | 100  | 1.4172          | -4.8884        | -5.6701          | 0.5898             | 0.7817          | -884.5335      | -800.6121    | -1.3358         | -1.3942       |
-| 0.1854        | 0.21  | 200  | 1.6754          | -6.1508        | -7.3259          | 0.6211             | 1.1752          | -1050.1200     | -926.8517    | -1.1088         | -1.1853       |
-| 0.1799        | 0.31  | 300  | 1.5590          | -5.9157        | -6.9794          | 0.5977             | 1.0637          | -1015.4615     | -903.3419    | -1.0193         | -1.1110       |
-| 0.1679        | 0.42  | 400  | 2.1030          | -7.8503        | -9.2060          | 0.6094             | 1.3557          | -1238.1252     | -1096.8108   | -0.5753         | -0.7096       |
-| 0.1693        | 0.52  | 500  | 1.6563          | -6.3408        | -7.6718          | 0.625              | 1.3310          | -1084.7078     | -945.8611    | -0.8598         | -0.9873       |
-| 0.1609        | 0.63  | 600  | 1.6818          | -6.4795        | -7.7992          | 0.6211             | 1.3198          | -1097.4480     | -959.7227    | -0.4515         | -0.6164       |
-| 0.1559        | 0.73  | 700  | 1.9278          | -7.3485        | -8.7955          | 0.6133             | 1.4470          | -1197.0731     | -1046.6217   | -0.4166         | -0.5852       |
-| 0.1433        | 0.84  | 800  | 1.9050          | -7.1496        | -8.6252          | 0.6172             | 1.4756          | -1180.0403     | -1026.7318   | -0.5141         | -0.6745       |
-| 0.1479        | 0.94  | 900  | 1.8979          | -6.9869        | -8.4701          | 0.6094             | 1.4832          | -1164.5387     | -1010.4669   | -0.5643         | -0.7199       |
 ### Framework versions

 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.3947
+- Rewards/chosen: -2.4314
+- Rewards/rejected: -2.0023
+- Rewards/accuracies: 0.3867
+- Rewards/margins: -0.4292
+- Logps/rejected: -517.7516
+- Logps/chosen: -554.9180
+- Logits/rejected: -1.0823
+- Logits/chosen: -1.1239
 ## Model description
 ### Training results
+| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
+|:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
+| 0.3047        | 0.1   | 100  | -2.4405       | -2.3863         | -361.0801    | -337.7748      | 0.8551          | 0.3203             | -0.4930        | -0.2905         | -0.2025          |
+| 0.1861        | 0.21  | 200  | -1.5418       | -1.5107         | -450.2716    | -421.0934      | 1.0495          | 0.3867             | -1.3850        | -0.3493         | -1.0357          |
+| 0.1608        | 0.31  | 300  | -1.4367       | -1.4022         | -454.9446    | -422.9684      | 1.0910          | 0.3945             | -1.4317        | -0.3772         | -1.0544          |
+| 0.1368        | 0.42  | 400  | -1.0538       | -1.0131         | -520.1699    | -479.6456      | 1.3010          | 0.4102             | -2.0839        | -0.4627         | -1.6212          |
+| 0.1364        | 0.52  | 500  | -1.6466       | -1.6090         | -470.0934    | -430.8614      | 1.1773          | 0.3711             | -1.5832        | -0.4498         | -1.1334          |
+| 0.1223        | 0.63  | 600  | -1.1880       | -1.1541         | -541.4883    | -500.4930      | 1.3206          | 0.4141             | -2.2971        | -0.4674         | -1.8297          |
+| 0.0971        | 0.73  | 700  | -0.9712       | -0.9392         | -577.3128    | -533.4667      | 1.4638          | 0.3906             | -2.6554        | -0.4959         | -2.1594          |
+| 0.1035        | 0.84  | 800  | -0.9232       | -0.8902         | -569.3817    | -532.9068      | 1.4475          | 0.3945             | -2.5761        | -0.4222         | -2.1538          |
+| 0.088         | 0.94  | 900  | -1.1239       | -1.0823         | -554.9180    | -517.7516      | 1.3947          | 0.3867             | -2.4314        | -0.4292         | -2.0023          |
 ### Framework versions
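
The final evaluation numbers in both README versions can be sanity-checked with a little arithmetic. The sketch below assumes the usual convention in DPO-style trainers that `Rewards/margins` is the chosen reward minus the rejected reward; the helper name `reward_margin` is hypothetical and not part of this repository.

```python
def reward_margin(chosen: float, rejected: float) -> float:
    """Margin under the assumed convention: chosen reward minus rejected reward."""
    return chosen - rejected


# Final evaluation values copied from the two README versions in this diff.
old_margin = reward_margin(chosen=-6.9869, rejected=-8.4701)
new_margin = reward_margin(chosen=-2.4314, rejected=-2.0023)

print(f"old margin: {old_margin:.4f}")  # README reports 1.4832
print(f"new margin: {new_margin:.4f}")  # README reports -0.4292 (rounding differs in the last digit)
```

Under that assumption, the positive margin in the old run and the negative margin in the new run line up with the reported `Rewards/accuracies` being above 0.5 before this commit and below 0.5 after it.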

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.1961859940234279,
-    "train_runtime": 15468.9338,
-    "train_samples": 122270,
-    "train_samples_per_second": 7.904,
-    "train_steps_per_second": 0.062
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.05187356284775659,
+    "train_runtime": 7314.4586,
+    "train_samples": 122268,
+    "train_samples_per_second": 16.716,
+    "train_steps_per_second": 0.131
 }
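
The throughput fields appear to follow directly from the sample count and runtime. As a minimal check, assuming `train_samples_per_second` is simply `train_samples / train_runtime` (the helper name `samples_per_second` is hypothetical):

```python
def samples_per_second(train_samples: int, train_runtime: float) -> float:
    """Throughput under the assumed definition: samples divided by runtime in seconds."""
    return train_samples / train_runtime


# Values copied from the old and new all_results.json in this diff.
old_rate = samples_per_second(122270, 15468.9338)
new_rate = samples_per_second(122268, 7314.4586)

print(f"old: {old_rate:.3f} samples/s")  # file reports 7.904
print(f"new: {new_rate:.3f} samples/s")  # file reports 16.716
print(f"speedup: {new_rate / old_rate:.2f}x")
```

Under this assumption the new run processes roughly twice as many samples per second as the old one, which matches the halved `train_runtime`.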

model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:995180ab1bfccd03f3ed43211962300537543bf349fa09701bdfed2446682de0
 size 4943178720

 version https://git-lfs.github.com/spec/v1
+oid sha256:938e0d524d2c5b738205f45cbd32d084ef09b48fc55b42d2357ec2767acdf6c6
 size 4943178720

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1bcfc7b87b51e023f9120c05ca16d79dc2363a9b1014b9e836f1d230e2035fc3
 size 4999819336

 version https://git-lfs.github.com/spec/v1
+oid sha256:1ec78470671379c923d72d3be323be44ae390376154e5d9fd23e55a6f9be5d44
 size 4999819336

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:93c22b80739cca4d51c122983391f1fc2fe409c9718974e8823aff56b6815dfe
 size 4540532728

 version https://git-lfs.github.com/spec/v1
+oid sha256:e68bdcd164f3a46f36753fb22642732b303167d2c92a5668db1be3f302bf4b44
 size 4540532728

runs/May10_12-21-48_n136-098-158/events.out.tfevents.1715315002.n136-098-158.1854935.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3194baf3f7b4da40a068d9a6ddf6c42210fe6a0907a962460e49f8c924dac98e
-size 35395

 version https://git-lfs.github.com/spec/v1
+oid sha256:12c7b7fa838a8b64c129bfdfbe99a6f5a620e4b04254e73d65b5d5412d77b2d2
+size 39189

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.1961859940234279,
-    "train_runtime": 15468.9338,
-    "train_samples": 122270,
-    "train_samples_per_second": 7.904,
-    "train_steps_per_second": 0.062
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.05187356284775659,
+    "train_runtime": 7314.4586,
+    "train_samples": 122268,
+    "train_samples_per_second": 16.716,
+    "train_steps_per_second": 0.131
 }

trainer_state.json CHANGED

The diff for this file is too large to render.

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:291902e71a2a4585a72bed7e38f885273985e04787086b5e9c772897150379c6
 size 6264

 version https://git-lfs.github.com/spec/v1
+oid sha256:5ee00489c1463fae472326de1af05f14960d5729e1f2cab6ea21d656cab52f1b
 size 6264