LaoRay committed on
Commit
82af11b
1 Parent(s): 670079e

Model save

README.md CHANGED
@@ -1,11 +1,8 @@
  ---
  base_model: alignment-handbook/zephyr-7b-sft-full
- datasets:
- - HuggingFaceH4/ultrafeedback_binarized
  library_name: peft
  license: apache-2.0
  tags:
- - alignment-handbook
  - trl
  - dpo
  - generated_from_trainer
@@ -19,17 +16,17 @@ should probably proofread and complete it, then remove this comment. -->

  # zephyr-7b-dpo-lora-r16-20k

- This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+ This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.5367
- - Rewards/chosen: -0.7912
- - Rewards/rejected: -1.4787
- - Rewards/accuracies: 0.7103
- - Rewards/margins: 0.6874
- - Logps/rejected: -395.8989
- - Logps/chosen: -362.3625
- - Logits/rejected: -2.5102
- - Logits/chosen: -2.5539
+ - Logits/chosen: -2.5568
+ - Logits/rejected: -2.5135
+ - Logps/chosen: -362.1219
+ - Logps/rejected: -395.5133
+ - Loss: 0.5370
+ - Rewards/accuracies: 0.7063
+ - Rewards/chosen: -0.7888
+ - Rewards/margins: 0.6860
+ - Rewards/rejected: -1.4748

  ## Model description

@@ -62,26 +59,26 @@ The following hyperparameters were used during training:

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
- |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.6895 | 0.08 | 100 | 0.6896 | 0.0099 | 0.0028 | 0.6627 | 0.0072 | -247.7537 | -282.2447 | -2.8481 | -2.8901 |
- | 0.653 | 0.16 | 200 | 0.6569 | -0.0133 | -0.0954 | 0.6865 | 0.0821 | -257.5692 | -284.5635 | -2.8339 | -2.8742 |
- | 0.6385 | 0.24 | 300 | 0.6190 | -0.2742 | -0.4752 | 0.6905 | 0.2011 | -295.5536 | -310.6566 | -2.8031 | -2.8399 |
- | 0.5689 | 0.32 | 400 | 0.6027 | -0.2972 | -0.5719 | 0.6944 | 0.2747 | -305.2159 | -312.9573 | -2.8083 | -2.8437 |
- | 0.5689 | 0.4 | 500 | 0.5750 | -0.6614 | -1.0704 | 0.7242 | 0.4089 | -355.0662 | -349.3812 | -2.7152 | -2.7560 |
- | 0.5884 | 0.48 | 600 | 0.5479 | -0.6965 | -1.2708 | 0.7123 | 0.5743 | -375.1053 | -352.8877 | -2.6322 | -2.6724 |
- | 0.5366 | 0.56 | 700 | 0.5462 | -0.7254 | -1.3351 | 0.7123 | 0.6097 | -381.5439 | -355.7809 | -2.6144 | -2.6541 |
- | 0.542 | 0.64 | 800 | 0.5451 | -0.6920 | -1.2686 | 0.7262 | 0.5766 | -374.8915 | -352.4363 | -2.5757 | -2.6163 |
- | 0.5282 | 0.72 | 900 | 0.5412 | -0.7969 | -1.4275 | 0.7083 | 0.6306 | -390.7825 | -362.9279 | -2.5266 | -2.5716 |
- | 0.5873 | 0.8 | 1000 | 0.5369 | -0.8233 | -1.5128 | 0.7083 | 0.6894 | -399.3072 | -365.5720 | -2.5254 | -2.5693 |
- | 0.5152 | 0.88 | 1100 | 0.5384 | -0.7446 | -1.4196 | 0.7143 | 0.6749 | -389.9855 | -357.7025 | -2.5188 | -2.5620 |
- | 0.5213 | 0.96 | 1200 | 0.5370 | -0.7888 | -1.4748 | 0.7063 | 0.6860 | -395.5133 | -362.1219 | -2.5135 | -2.5568 |
+ | Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
+ |:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
+ | 0.6895 | 0.08 | 100 | -2.8901 | -2.8481 | -282.2447 | -247.7537 | 0.6896 | 0.6627 | 0.0099 | 0.0072 | 0.0028 |
+ | 0.653 | 0.16 | 200 | -2.8742 | -2.8339 | -284.5635 | -257.5692 | 0.6569 | 0.6865 | -0.0133 | 0.0821 | -0.0954 |
+ | 0.6385 | 0.24 | 300 | -2.8399 | -2.8031 | -310.6566 | -295.5536 | 0.6190 | 0.6905 | -0.2742 | 0.2011 | -0.4752 |
+ | 0.5689 | 0.32 | 400 | -2.8437 | -2.8083 | -312.9573 | -305.2159 | 0.6027 | 0.6944 | -0.2972 | 0.2747 | -0.5719 |
+ | 0.5689 | 0.4 | 500 | -2.7560 | -2.7152 | -349.3812 | -355.0662 | 0.5750 | 0.7242 | -0.6614 | 0.4089 | -1.0704 |
+ | 0.5884 | 0.48 | 600 | -2.6724 | -2.6322 | -352.8877 | -375.1053 | 0.5479 | 0.7123 | -0.6965 | 0.5743 | -1.2708 |
+ | 0.5366 | 0.56 | 700 | -2.6541 | -2.6144 | -355.7809 | -381.5439 | 0.5462 | 0.7123 | -0.7254 | 0.6097 | -1.3351 |
+ | 0.542 | 0.64 | 800 | -2.6163 | -2.5757 | -352.4363 | -374.8915 | 0.5451 | 0.7262 | -0.6920 | 0.5766 | -1.2686 |
+ | 0.5282 | 0.72 | 900 | -2.5716 | -2.5266 | -362.9279 | -390.7825 | 0.5412 | 0.7083 | -0.7969 | 0.6306 | -1.4275 |
+ | 0.5873 | 0.8 | 1000 | -2.5693 | -2.5254 | -365.5720 | -399.3072 | 0.5369 | 0.7083 | -0.8233 | 0.6894 | -1.5128 |
+ | 0.5152 | 0.88 | 1100 | -2.5620 | -2.5188 | -357.7025 | -389.9855 | 0.5384 | 0.7143 | -0.7446 | 0.6749 | -1.4196 |
+ | 0.5213 | 0.96 | 1200 | -2.5568 | -2.5135 | -362.1219 | -395.5133 | 0.5370 | 0.7063 | -0.7888 | 0.6860 | -1.4748 |


  ### Framework versions

  - PEFT 0.12.0
  - Transformers 4.44.0
- - Pytorch 2.4.0+cu121
- - Datasets 2.20.0
+ - Pytorch 2.1.2+cu118
+ - Datasets 2.21.0
  - Tokenizers 0.19.1
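For readers who want to try the adapter described by this card, a minimal loading sketch with `transformers` and `peft` follows. It is not part of the commit: the adapter repository id `LaoRay/zephyr-7b-dpo-lora-r16-20k` is assumed from the card title, and `device_map="auto"` assumes `accelerate` is installed.

```python
# Minimal sketch (assumptions noted above): load the DPO LoRA adapter on top of
# the zephyr-7b-sft-full base model and generate a short completion.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "alignment-handbook/zephyr-7b-sft-full"
adapter_id = "LaoRay/zephyr-7b-dpo-lora-r16-20k"  # assumed repo id, taken from the card title

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attaches the LoRA weights

prompt = "Explain direct preference optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```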
adapter_config.json CHANGED
@@ -20,13 +20,13 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
+ "o_proj",
+ "down_proj",
+ "up_proj",
  "q_proj",
- "gate_proj",
  "k_proj",
- "down_proj",
- "o_proj",
  "v_proj",
- "up_proj"
+ "gate_proj"
  ],
  "task_type": "CAUSAL_LM",
  "use_dora": false,
all_results.json CHANGED
@@ -1,22 +1,9 @@
  {
  "epoch": 1.0,
- "eval_logits/chosen": -2.553934335708618,
- "eval_logits/rejected": -2.510233163833618,
- "eval_logps/chosen": -362.36248779296875,
- "eval_logps/rejected": -395.8988952636719,
- "eval_loss": 0.5367478132247925,
- "eval_rewards/accuracies": 0.7103174328804016,
- "eval_rewards/chosen": -0.7912437319755554,
- "eval_rewards/margins": 0.6874489188194275,
- "eval_rewards/rejected": -1.4786925315856934,
- "eval_runtime": 165.9175,
- "eval_samples": 500,
- "eval_samples_per_second": 3.014,
- "eval_steps_per_second": 0.38,
  "total_flos": 0.0,
- "train_loss": 0.5874967678070069,
- "train_runtime": 15864.6959,
+ "train_loss": 0.0,
+ "train_runtime": 0.007,
  "train_samples": 20000,
- "train_samples_per_second": 1.261,
- "train_steps_per_second": 0.079
+ "train_samples_per_second": 2851328.348,
+ "train_steps_per_second": 178208.022
  }
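For reference, the throughput fields in these JSON files are simple ratios of the other entries: in the removed values, `eval_samples_per_second` ≈ 500 / 165.9175 ≈ 3.014, `train_samples_per_second` ≈ 20000 / 15864.6959 ≈ 1.261, and `train_steps_per_second` ≈ 1250 / 15864.6959 ≈ 0.079 (1250 steps, per trainer_state.json below).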
train_results.json CHANGED
@@ -1,9 +1,9 @@
  {
  "epoch": 1.0,
  "total_flos": 0.0,
- "train_loss": 0.5874967678070069,
- "train_runtime": 15864.6959,
+ "train_loss": 0.0,
+ "train_runtime": 0.007,
  "train_samples": 20000,
- "train_samples_per_second": 1.261,
- "train_steps_per_second": 0.079
+ "train_samples_per_second": 2851328.348,
+ "train_steps_per_second": 178208.022
  }
trainer_state.json CHANGED
@@ -3969,10 +3969,10 @@
  "epoch": 1.0,
  "step": 1250,
  "total_flos": 0.0,
- "train_loss": 0.5874967678070069,
- "train_runtime": 15864.6959,
- "train_samples_per_second": 1.261,
- "train_steps_per_second": 0.079
+ "train_loss": 0.0,
+ "train_runtime": 0.007,
+ "train_samples_per_second": 2851328.348,
+ "train_steps_per_second": 178208.022
  }
  ],
  "logging_steps": 5,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:32c17f9082a66b364c07bf1e41fcd94aa4247249b2741e5435f6bc9b78cdb3bb
+ oid sha256:1be4d7d17cad82502d0bb662ad8f3519df18c33c47b46c125a7d71e70f88371f
  size 6200