Model save

Files changed (13)

README.md +19 -19
all_results.json +5 -5
config.json +1 -1
model-00001-of-00007.safetensors +1 -1
model-00002-of-00007.safetensors +1 -1
model-00003-of-00007.safetensors +1 -1
model-00004-of-00007.safetensors +1 -1
model-00005-of-00007.safetensors +1 -1
model-00006-of-00007.safetensors +1 -1
model-00007-of-00007.safetensors +1 -1
runs/Sep14_21-14-45_65ecb96dba42/events.out.tfevents.1726348544.65ecb96dba42.1985.0 +2 -2
train_results.json +5 -5
trainer_state.json +0 -0

README.md CHANGED

@@ -3,12 +3,10 @@ library_name: transformers
 license: gemma
 base_model: google/gemma-7b
 tags:
-- alignment-handbook
 - trl
 - orpo
 - generated_from_trainer
-datasets:
-- argilla/dpo-mix-7k
 model-index:
 - name: gemma-7b-orpo
   results: []
@@ -19,20 +17,20 @@ should probably proofread and complete it, then remove this comment. -->
 # gemma-7b-orpo
-This model is a fine-tuned version of [google/gemma-7b](https://huggingface.co/google/gemma-7b) on the argilla/dpo-mix-7k dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4556
-- Rewards/chosen: -0.0513
-- Rewards/rejected: -0.0589
-- Rewards/accuracies: 0.5108
-- Rewards/margins: 0.0076
-- Logps/rejected: -1.1787
-- Logps/chosen: -1.0268
-- Logits/rejected: 312.9670
-- Logits/chosen: 340.5321
-- Nll Loss: 1.4096
-- Log Odds Ratio: -0.6928
-- Log Odds Chosen: 0.2398
 ## Model description
@@ -60,15 +58,17 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 4
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 100
-- num_epochs: 1
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
-| 1.3423        | 1.0   | 1259 | 1.4556          | -0.0513        | -0.0589          | 0.5108             | 0.0076          | -1.1787        | -1.0268      | 312.9670        | 340.5321      | 1.4096   | -0.6928        | 0.2398          |
 ### Framework versions

 license: gemma
 base_model: google/gemma-7b
 tags:
 - trl
 - orpo
+- alignment-handbook
 - generated_from_trainer
 model-index:
 - name: gemma-7b-orpo
   results: []
 # gemma-7b-orpo
+This model is a fine-tuned version of [google/gemma-7b](https://huggingface.co/google/gemma-7b) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.7559
+- Rewards/chosen: -0.0650
+- Rewards/rejected: -0.0764
+- Rewards/accuracies: 0.5971
+- Rewards/margins: 0.0114
+- Logps/rejected: -1.5282
+- Logps/chosen: -1.3004
+- Logits/rejected: 266.0260
+- Logits/chosen: 295.6202
+- Nll Loss: 1.6941
+- Log Odds Ratio: -0.6992
+- Log Odds Chosen: 0.3721
 ## Model description
 - total_train_batch_size: 4
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: inverse_sqrt
 - lr_scheduler_warmup_steps: 100
+- num_epochs: 3
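The hunk above switches `lr_scheduler_type` from `cosine` to `inverse_sqrt` while keeping 100 warmup steps. The sketch below is a hand-written approximation of an inverse-square-root schedule with linear warmup, not the library's code. The function name and the default `timescale` (set equal to the warmup length) are assumptions, and the base learning rate is not visible in this diff, so only the multiplier is computed.

```python
import math


def inverse_sqrt_multiplier(step, num_warmup_steps=100, timescale=None):
    """Learning-rate multiplier: linear warmup, then 1/sqrt decay.

    Assumption: timescale defaults to the warmup length, so the
    multiplier is exactly 1.0 at the end of warmup.
    """
    if timescale is None:
        timescale = num_warmup_steps or 10_000
    if step < num_warmup_steps:
        # Linear ramp from 0 to 1 over the warmup steps.
        return step / max(1, num_warmup_steps)
    shift = timescale - num_warmup_steps
    return 1.0 / math.sqrt((step + shift) / timescale)


# Multiplier at a few points of the 3777-step run reported in this commit.
for step in (50, 100, 400, 1259, 3777):
    print(step, round(inverse_sqrt_multiplier(step), 4))
```

Under these assumptions the multiplier never reaches zero, unlike a cosine schedule, so the final steps of the third epoch would still train at a noticeable fraction of the peak rate.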
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|:--------:|:--------------:|:---------------:|
+| 1.3309        | 1.0   | 1259 | 1.4432          | -0.0513        | -0.0583          | 0.5468             | 0.0071          | -1.1666        | -1.0254      | 310.9833        | 338.2715      | 1.3964   | -0.7034        | 0.2119          |
+| 0.647         | 2.0   | 2518 | 1.4816          | -0.0529        | -0.0637          | 0.5899             | 0.0108          | -1.2742        | -1.0583      | 296.0398        | 324.3109      | 1.4304   | -0.6778        | 0.3416          |
+| 0.348         | 3.0   | 3777 | 1.7559          | -0.0650        | -0.0764          | 0.5971             | 0.0114          | -1.5282        | -1.3004      | 266.0260        | 295.6202      | 1.6941   | -0.6992        | 0.3721          |
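Read row by row, the new table shows training loss falling (1.3309 to 0.348) while validation loss rises (1.4432 to 1.7559), a pattern that often indicates overfitting, even though reward accuracy keeps improving. A minimal sketch for checking the rows follows; the dictionary keys are illustrative names, and the assumption that `Rewards/margins` equals `Rewards/chosen` minus `Rewards/rejected` is tested only up to the table's four-decimal rounding.

```python
# Values copied from the "+" rows of the training results table.
rows = [
    {"epoch": 1, "train_loss": 1.3309, "val_loss": 1.4432,
     "chosen": -0.0513, "rejected": -0.0583, "margin": 0.0071, "accuracy": 0.5468},
    {"epoch": 2, "train_loss": 0.647, "val_loss": 1.4816,
     "chosen": -0.0529, "rejected": -0.0637, "margin": 0.0108, "accuracy": 0.5899},
    {"epoch": 3, "train_loss": 0.348, "val_loss": 1.7559,
     "chosen": -0.0650, "rejected": -0.0764, "margin": 0.0114, "accuracy": 0.5971},
]

# Margin should match chosen - rejected, up to rounding of the printed values.
margins_consistent = all(
    abs((r["chosen"] - r["rejected"]) - r["margin"]) <= 1.5e-4 for r in rows
)

# Epoch with the lowest validation loss, and the one with the best accuracy.
best_by_val_loss = min(rows, key=lambda r: r["val_loss"])["epoch"]
best_by_accuracy = max(rows, key=lambda r: r["accuracy"])["epoch"]

print(margins_consistent, best_by_val_loss, best_by_accuracy)
```

The two selection criteria disagree here, so which checkpoint counts as best depends on whether validation loss or preference accuracy matters more for the intended use.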
 ### Framework versions

all_results.json CHANGED

@@ -1,5 +1,5 @@
 {
-    "epoch": 1.0,
     "eval_log_odds_chosen": 0.23976314067840576,
     "eval_log_odds_ratio": -0.6928443908691406,
     "eval_logits/chosen": 340.5321350097656,
@@ -17,9 +17,9 @@
     "eval_samples_per_second": 6.122,
     "eval_steps_per_second": 1.539,
     "total_flos": 0.0,
-    "train_loss": 1.8019611810861456,
-    "train_runtime": 4470.8327,
     "train_samples": 5034,
-    "train_samples_per_second": 1.126,
-    "train_steps_per_second": 0.282
 }

 {
+    "epoch": 3.0,
     "eval_log_odds_chosen": 0.23976314067840576,
     "eval_log_odds_ratio": -0.6928443908691406,
     "eval_logits/chosen": 340.5321350097656,
     "eval_samples_per_second": 6.122,
     "eval_steps_per_second": 1.539,
     "total_flos": 0.0,
+    "train_loss": 0.968865410152301,
+    "train_runtime": 16784.8411,
     "train_samples": 5034,
+    "train_samples_per_second": 0.9,
+    "train_steps_per_second": 0.225
 }
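The throughput fields can be cross-checked against the other numbers in this commit. The sketch below assumes that one optimizer step consumes `total_train_batch_size` (4) samples and that the last partial batch of each epoch is kept; the helper name is hypothetical. Both assumptions agree with the step counts 1259, 2518, and 3777 in the README table.

```python
import math

TRAIN_SAMPLES = 5034          # "train_samples" in all_results.json
TOTAL_TRAIN_BATCH_SIZE = 4    # from the README hyperparameter list


def throughput(num_epochs, runtime_seconds):
    """Recompute step count and throughput from epochs and wall-clock time."""
    steps_per_epoch = math.ceil(TRAIN_SAMPLES / TOTAL_TRAIN_BATCH_SIZE)
    total_steps = steps_per_epoch * num_epochs
    return {
        "steps": total_steps,
        "samples_per_second": round(TRAIN_SAMPLES * num_epochs / runtime_seconds, 3),
        "steps_per_second": round(total_steps / runtime_seconds, 3),
    }


old_run = throughput(num_epochs=1, runtime_seconds=4470.8327)
new_run = throughput(num_epochs=3, runtime_seconds=16784.8411)
print(old_run)
print(new_run)
```

The recomputed values match the reported 1.126 / 0.282 for the one-epoch run and 0.9 / 0.225 for the three-epoch run, so the new run was about 20% slower per sample.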

config.json CHANGED

@@ -24,6 +24,6 @@
   "rope_theta": 10000.0,
   "torch_dtype": "float32",
   "transformers_version": "4.44.2",
-  "use_cache": true,
   "vocab_size": 256000
 }

   "rope_theta": 10000.0,
   "torch_dtype": "float32",
   "transformers_version": "4.44.2",
+  "use_cache": false,
   "vocab_size": 256000
 }

model-00001-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bd4a83dd65f561dc2599e83a75a96e4671b22a8d44a91d13804b5817f11e479e
 size 4913707856

 version https://git-lfs.github.com/spec/v1
+oid sha256:6815176c0edeb1b2d9103c46cff62f3299bca9d17badb357f0717e1d64f4f850
 size 4913707856

model-00002-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:68d35ef7b8ba18b485054edd5028a69b186f2e87090f1c967e18e2cb388e8fc2
 size 4932629336

 version https://git-lfs.github.com/spec/v1
+oid sha256:9e3e009f3c4d62c1637fb0e64040c282af20425a531cc4ac57428221cd7c55a3
 size 4932629336

model-00003-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:66e9243a62c14bc7daa3053a11e94aba695043eff823657007d0b8e2e324c277
 size 4731277496

 version https://git-lfs.github.com/spec/v1
+oid sha256:b27b0695258c0c3d410aada3325a78ed501fbaad1baa55cc52cbb0bee6c384ee
 size 4731277496

model-00004-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ef3c0cc61768d3909d22bff6f43e2220b3106289dcedfd695ab49a6ecb25bf23
 size 4731277512

 version https://git-lfs.github.com/spec/v1
+oid sha256:372e11ace41fa94f9fd26e9a223de210716209246f21ce0b2acd9e83abd8f914
 size 4731277512

model-00005-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:52de71d4dc579e43e815865e9f9465a60c65007415afd9126472335b75ad9de9
 size 4932629384

 version https://git-lfs.github.com/spec/v1
+oid sha256:dc17a3914b28c68cc1c5f8233edc906f780aab9a6aff51898157176c68831f70
 size 4932629384

model-00006-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:33dca2a2d5480217ee4b7fe6d397046dc659209810e220c7748253954c463620
 size 4731277512

 version https://git-lfs.github.com/spec/v1
+oid sha256:dc709fd64a3528b12ac0bdb36f3bd88880b8422254d1f3363ee9251efd11bb03
 size 4731277512

model-00007-of-00007.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5fc66395bcb69e88597c76b0b9e196185cacd6d03c2c416dc385ba07ab290111
 size 2818648664

 version https://git-lfs.github.com/spec/v1
+oid sha256:f4bddb1c97e71e9159aec2c5e5c5e861ebcca52b43f5233c13dff2a0ce736b1c
 size 2818648664
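Each shard diff above changes only the `oid` of a Git LFS pointer: the byte size is identical, but the SHA-256 of the weights differs. A downloaded shard can be checked against its pointer roughly as follows. This is a minimal sketch; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is left to the caller.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hex digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap check first; avoids hashing several gigabytes
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Stream in chunks so multi-gigabyte shards are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["oid"]


# New pointer for model-00007-of-00007.safetensors, copied from the diff.
pointer = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:f4bddb1c97e71e9159aec2c5e5c5e861ebcca52b43f5233c13dff2a0ce736b1c
size 2818648664
""")
print(pointer["size"])
```

Calling `matches_pointer("model-00007-of-00007.safetensors", pointer)` on a local copy would then distinguish the shard saved by this commit from the previous one, since only the digest changed.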

runs/Sep14_21-14-45_65ecb96dba42/events.out.tfevents.1726348544.65ecb96dba42.1985.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c968e761e1c166f0ef0f35ed4cfa4a021ee28d353f6e5adbf72a6dc721370c91
-size 656102

 version https://git-lfs.github.com/spec/v1
+oid sha256:8425184108ea6a9b4d16da1f627e3a7d614630ee878bdf6cb9866adb525bca72
+size 657365

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
-    "epoch": 1.0,
     "total_flos": 0.0,
-    "train_loss": 1.8019611810861456,
-    "train_runtime": 4470.8327,
     "train_samples": 5034,
-    "train_samples_per_second": 1.126,
-    "train_steps_per_second": 0.282
 }

 {
+    "epoch": 3.0,
     "total_flos": 0.0,
+    "train_loss": 0.968865410152301,
+    "train_runtime": 16784.8411,
     "train_samples": 5034,
+    "train_samples_per_second": 0.9,
+    "train_steps_per_second": 0.225
 }

trainer_state.json CHANGED

The diff for this file is too large to render.