Model save

Files changed (10)

README.md +20 -20
all_results.json +5 -5
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1
runs/May11_01-40-52_n136-082-130/events.out.tfevents.1715363605.n136-082-130.679255.0 +2 -2
runs/May11_06-27-51_n136-082-130/events.out.tfevents.1715380893.n136-082-130.803970.0 +3 -0
train_results.json +5 -5
trainer_state.json +0 -0
training_args.bin +1 -1

README.md CHANGED

@@ -15,15 +15,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.0232
-- Rewards/chosen: -7.1946
-- Rewards/rejected: -8.7238
-- Rewards/accuracies: 0.6133
-- Rewards/margins: 1.5292
-- Logps/rejected: -1160.1461
-- Logps/chosen: -1001.0963
-- Logits/rejected: -0.4190
-- Logits/chosen: -0.5892
 ## Model description
@@ -58,17 +58,17 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
-|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.254         | 0.1   | 100  | 1.4761          | -5.3290        | -6.2112          | 0.5898             | 0.8822          | -908.8818      | -814.5385    | -1.4873         | -1.5203       |
-| 0.1844        | 0.21  | 200  | 1.7253          | -6.1555        | -7.4481          | 0.6133             | 1.2926          | -1032.5726     | -897.1824    | -1.4103         | -1.4664       |
-| 0.1635        | 0.31  | 300  | 1.6677          | -6.1768        | -7.3921          | 0.5938             | 1.2153          | -1026.9750     | -899.3143    | -0.6257         | -0.7515       |
-| 0.1606        | 0.42  | 400  | 2.0307          | -7.0774        | -8.4601          | 0.6016             | 1.3827          | -1133.7700     | -989.3718    | -0.4798         | -0.6143       |
-| 0.163         | 0.52  | 500  | 1.8216          | -6.5495        | -8.0368          | 0.5898             | 1.4873          | -1091.4379     | -936.5793    | -0.8136         | -0.9380       |
-| 0.1656        | 0.63  | 600  | 1.8091          | -6.5309        | -7.9593          | 0.625              | 1.4284          | -1083.6920     | -934.7285    | -0.3700         | -0.5360       |
-| 0.1552        | 0.73  | 700  | 2.0767          | -7.6318        | -9.1866          | 0.5977             | 1.5547          | -1206.4197     | -1044.8179   | -0.3588         | -0.5351       |
-| 0.1377        | 0.84  | 800  | 2.0307          | -7.2870        | -8.8043          | 0.6055             | 1.5173          | -1168.1901     | -1010.3356   | -0.4156         | -0.5887       |
-| 0.1462        | 0.94  | 900  | 2.0232          | -7.1946        | -8.7238          | 0.6133             | 1.5292          | -1160.1461     | -1001.0963   | -0.4190         | -0.5892       |
 ### Framework versions

 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
+- Logits/chosen: -0.7535
+- Logits/rejected: -0.5762
+- Logps/chosen: -1012.8752
+- Logps/rejected: -1186.8893
+- Loss: 2.0172
+- Rewards/accuracies: 0.6211
+- Rewards/chosen: -7.3124
+- Rewards/margins: 1.6789
+- Rewards/rejected: -8.9913
 ## Model description
 ### Training results
+| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
+|:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
+| 0.2553        | 0.1   | 100  | -1.4311       | -1.3999         | -810.3989    | -903.4209      | 1.4885          | 0.5938             | -5.2877        | 0.8689          | -6.1566          |
+| 0.1835        | 0.21  | 200  | -1.1732       | -1.0838         | -985.7894    | -1125.4023     | 1.7331          | 0.6406             | -7.0416        | 1.3349          | -8.3764          |
+| 0.1686        | 0.31  | 300  | -0.4382       | -0.2631         | -962.0720    | -1124.1737     | 1.8406          | 0.6211             | -6.8044        | 1.5597          | -8.3641          |
+| 0.1652        | 0.42  | 400  | -0.9962       | -0.8431         | -1031.8544   | -1193.3768     | 2.2100          | 0.6133             | -7.5022        | 1.5539          | -9.0562          |
+| 0.1641        | 0.52  | 500  | -0.6576       | -0.4578         | -968.2329    | -1137.7808     | 1.7548          | 0.6211             | -6.8660        | 1.6342          | -8.5002          |
+| 0.1644        | 0.63  | 600  | -0.6600       | -0.4861         | -931.4107    | -1085.6713     | 1.7738          | 0.6328             | -6.4978        | 1.4813          | -7.9791          |
+| 0.1568        | 0.73  | 700  | -0.5415       | -0.3529         | -1035.0319   | -1212.3086     | 2.0363          | 0.6367             | -7.5340        | 1.7115          | -9.2455          |
+| 0.1394        | 0.84  | 800  | -0.6702       | -0.4872         | -1030.3461   | -1205.7072     | 2.0376          | 0.6289             | -7.4871        | 1.6923          | -9.1795          |
+| 0.1421        | 0.94  | 900  | -0.7535       | -0.5762         | -1012.8752   | -1186.8893     | 2.0172          | 0.6211             | -7.3124        | 1.6789          | -8.9913          |
 ### Framework versions
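
In both versions of the README table, the Rewards/margins column appears to equal Rewards/chosen minus Rewards/rejected. The following is a minimal sketch that checks this relationship against the step-900 rows; the helper name `reward_margin` is hypothetical, and the numbers are copied from the tables above.

```python
def reward_margin(chosen: float, rejected: float) -> float:
    """Return the gap between the chosen and rejected rewards."""
    return chosen - rejected


# Step-900 evaluation rows: (Rewards/chosen, Rewards/rejected, reported Rewards/margins)
ROWS = {
    "old": (-7.1946, -8.7238, 1.5292),
    "new": (-7.3124, -8.9913, 1.6789),
}

for name, (chosen, rejected, reported) in ROWS.items():
    computed = reward_margin(chosen, rejected)
    # Allow a small tolerance because the README values are rounded to 4 decimals.
    matches = abs(computed - reported) < 1e-3
    print(f"{name}: computed={computed:.4f} reported={reported:.4f} matches={matches}")
```

Under this reading, a larger margin means the rejected responses are penalized more than the chosen ones, even though both rewards are negative.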

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.194384015168195,
-    "train_runtime": 15417.2553,
-    "train_samples": 122270,
-    "train_samples_per_second": 7.931,
-    "train_steps_per_second": 0.062
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.1926751920690087,
+    "train_runtime": 814.3751,
+    "train_samples": 122268,
+    "train_samples_per_second": 150.137,
+    "train_steps_per_second": 1.173
 }
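
The throughput field in these results is consistent with `train_samples / train_runtime`. The sketch below recomputes it for both versions of the file, assuming `train_runtime` is in seconds; the function name `samples_per_second` is illustrative and not part of any library.

```python
def samples_per_second(train_samples: int, train_runtime: float) -> float:
    """Throughput as samples divided by wall-clock runtime (assumed seconds)."""
    return train_samples / train_runtime


# Values copied from the removed (-) and added (+) sides of the diff.
OLD = {"train_runtime": 15417.2553, "train_samples": 122270, "reported": 7.931}
NEW = {"train_runtime": 814.3751, "train_samples": 122268, "reported": 150.137}

for label, run in (("old", OLD), ("new", NEW)):
    rate = samples_per_second(run["train_samples"], run["train_runtime"])
    print(f"{label}: computed={rate:.3f} reported={run['reported']}")

# Ratio of the two reported runtimes.
print(f"runtime ratio: {OLD['train_runtime'] / NEW['train_runtime']:.2f}")
```

The diff alone does not say why the reported runtime dropped, so the ratio is only a description of the two numbers.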

model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7376a50ac65872ba0f050dddd468c17c88ed63f7db168915cfc56727b44a7986
 size 4943178720

 version https://git-lfs.github.com/spec/v1
+oid sha256:ada5b7290d0922eb776e4ba7c3bac0506cbfc55b90de703c1c1b9078ef15e8c2
 size 4943178720

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bbb7c97be51922fb040f2d98b5ff5071afea302437ca3fb23a3e471d57e69a9a
 size 4999819336

 version https://git-lfs.github.com/spec/v1
+oid sha256:4a5d5612736e3a036a0950e1836129483c21f374b02b958c3dc37207323c61fe
 size 4999819336

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4f5fe56371c7bc0f27b56b5e0f0eedaa0dedc324888c26a2b74345406bbc0ba6
 size 4540532728

 version https://git-lfs.github.com/spec/v1
+oid sha256:db7e1d1a0f47b6b4206707315bed80b309dc28ecf5f710a29d9412fd7573e154
 size 4540532728
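
Each safetensors shard is stored as a Git LFS pointer: three `key value` lines giving the spec version, the object hash, and the size in bytes. In the three shard diffs above the size is unchanged and only the `oid` differs. A minimal parsing sketch follows; `parse_lfs_pointer` is a hypothetical helper, not part of Git LFS.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer for the third shard, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:db7e1d1a0f47b6b4206707315bed80b309dc28ecf5f710a29d9412fd7573e154
size 4540532728
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["digest"][:8], info["size"])
```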

runs/May11_01-40-52_n136-082-130/events.out.tfevents.1715363605.n136-082-130.679255.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:42ea284f7f1aa301a4333f90089ef4dc7c78a1f7b7236610185897c00daa473e
-size 73954

 version https://git-lfs.github.com/spec/v1
+oid sha256:202893c44a4a811f9f4be554609293311a386553d0eb28ad7ff4fc94d0e247d3
+size 77748

runs/May11_06-27-51_n136-082-130/events.out.tfevents.1715380893.n136-082-130.803970.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fbfd3aa0e2084b4fa44aa34968a622ae8b4106d3bcb1dce04433421501c6be70
+size 8649

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 0.194384015168195,
-    "train_runtime": 15417.2553,
-    "train_samples": 122270,
-    "train_samples_per_second": 7.931,
-    "train_steps_per_second": 0.062
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.1926751920690087,
+    "train_runtime": 814.3751,
+    "train_samples": 122268,
+    "train_samples_per_second": 150.137,
+    "train_steps_per_second": 1.173
 }

trainer_state.json CHANGED

The diff for this file is too large to render.

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6ae281a8958994a7bce439d9c20b58e6e6645068e54f28e9d838d5d683846cc2
 size 6200

 version https://git-lfs.github.com/spec/v1
+oid sha256:ed2056e9b8414b41a9fff131f35b4438e77e1609ffe55e8570c5c650ab634922
 size 6200