jdang/openhermes-mistral-7b-dpo-gptq

Files changed (5)

README.md +14 -14
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
runs/Jan20_00-27-19_025adacbeb60/events.out.tfevents.1705710535.025adacbeb60.563.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6643
-- Rewards/chosen: 0.0014
-- Rewards/rejected: -0.0701
-- Rewards/accuracies: 0.8125
-- Rewards/margins: 0.0714
-- Logps/rejected: -216.5143
-- Logps/chosen: -215.7596
-- Logits/rejected: -2.5311
-- Logits/chosen: -2.5242
 ## Model description
@@ -58,11 +58,11 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.6845        | 0.01  | 10   | 0.6884          | 0.0077         | -0.0181          | 0.6875             | 0.0258          | -215.9950      | -215.6967    | -2.5278         | -2.5201       |
-| 0.7249        | 0.01  | 20   | 0.6905          | -0.0073        | -0.0256          | 0.75               | 0.0183          | -216.0695      | -215.8464    | -2.5330         | -2.5253       |
-| 0.6441        | 0.01  | 30   | 0.6808          | 0.0070         | -0.0255          | 0.75               | 0.0325          | -216.0689      | -215.7035    | -2.5350         | -2.5269       |
-| 0.6393        | 0.02  | 40   | 0.6657          | -0.0032        | -0.0731          | 0.875              | 0.0699          | -216.5449      | -215.8051    | -2.5327         | -2.5248       |
-| 0.6818        | 0.03  | 50   | 0.6643          | 0.0014         | -0.0701          | 0.8125             | 0.0714          | -216.5143      | -215.7596    | -2.5311         | -2.5242       |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6104
+- Rewards/chosen: -0.0458
+- Rewards/rejected: -0.4535
+- Rewards/accuracies: 0.6875
+- Rewards/margins: 0.4077
+- Logps/rejected: -390.3771
+- Logps/chosen: -149.5892
+- Logits/rejected: -1.3692
+- Logits/chosen: -1.4352
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6865        | 0.01  | 10   | 0.6792          | -0.0093        | -0.0078          | 0.6875             | -0.0015         | -385.9200      | -149.2238    | -1.3698         | -1.4189       |
+| 0.6882        | 0.01  | 20   | 0.6660          | -0.0137        | -0.0526          | 0.625              | 0.0389          | -386.3681      | -149.2680    | -1.3729         | -1.4240       |
+| 0.6391        | 0.01  | 30   | 0.6446          | 0.0000         | -0.1131          | 0.625              | 0.1131          | -386.9731      | -149.1310    | -1.3737         | -1.4292       |
+| 0.639         | 0.02  | 40   | 0.6271          | -0.0337        | -0.2758          | 0.6875             | 0.2421          | -388.6000      | -149.4686    | -1.3729         | -1.4342       |
+| 0.6533        | 0.03  | 50   | 0.6104          | -0.0458        | -0.4535          | 0.6875             | 0.4077          | -390.3771      | -149.5892    | -1.3692         | -1.4352       |
 ### Framework versions
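The reward columns in the updated table can be sanity-checked against each other. The sketch below assumes that `Rewards/margins` is the mean of chosen minus rejected rewards, which is how DPO-style trainers usually report it. The diff itself does not state this. The rows are copied from the new table, and `margin_gap` and `TOLERANCE` are illustrative names, not part of any library.

```python
# Eval rows copied from the updated README table:
# (step, rewards/chosen, rewards/rejected, rewards/margins)
rows = [
    (10, -0.0093, -0.0078, -0.0015),
    (20, -0.0137, -0.0526, 0.0389),
    (30, 0.0000, -0.1131, 0.1131),
    (40, -0.0337, -0.2758, 0.2421),
    (50, -0.0458, -0.4535, 0.4077),
]

# Reported values are rounded to 4 decimals, so allow a small rounding gap.
TOLERANCE = 2e-4


def margin_gap(chosen, rejected, reported):
    """Absolute difference between (chosen - rejected) and the reported margin."""
    return abs((chosen - rejected) - reported)


gaps = {step: margin_gap(c, r, m) for step, c, r, m in rows}
for step, gap in gaps.items():
    status = "ok" if gap <= TOLERANCE else "mismatch"
    print(f"step {step}: gap={gap:.4f} {status}")
```

Under that assumption every row is consistent within rounding. The same holds for the previous run's final row (0.0014, -0.0701, 0.0714).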

adapter_config.json CHANGED

@@ -19,8 +19,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q_proj",
-    "v_proj"
   ],
   "task_type": "CAUSAL_LM"
 }

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "v_proj",
+    "q_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
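The only change in this hunk is the order of the two `target_modules` entries. A minimal sketch of an order-insensitive comparison follows. It assumes module order carries no meaning for the adapter. That is believed to be the case for PEFT LoRA configs but is not confirmed by this diff. The JSON fragments are reduced to the fields shown in the hunk, and `same_targets` is a hypothetical helper.

```python
import json

# Fragments reduced to the fields visible in the hunk (not the full config file).
old_fragment = '{"target_modules": ["q_proj", "v_proj"], "task_type": "CAUSAL_LM"}'
new_fragment = '{"target_modules": ["v_proj", "q_proj"], "task_type": "CAUSAL_LM"}'


def same_targets(a: str, b: str) -> bool:
    """Compare target_modules as sets, ignoring list order."""
    return set(json.loads(a)["target_modules"]) == set(json.loads(b)["target_modules"])


print(same_targets(old_fragment, new_fragment))
```

If the assumption holds, this hunk is a serialization artifact and not a change in which layers are adapted.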

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bd728a3c24f4eeefa21c5c5fe3eda5cdd157640505696c5e78211b034d7c8412
 size 13648432

 version https://git-lfs.github.com/spec/v1
+oid sha256:8b49ade7c4cf516261beb948b85950f3fa2e2ddcbde5e64a5f6affa856914ab9
 size 13648432
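The binary files in this commit appear only as Git LFS pointers with three fields: `version`, `oid` and `size`. The sketch below parses such a pointer and checks a downloaded payload against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for illustration, and the parsing assumes the simple one-key-per-line layout shown in this diff.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True if the payload has the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# New pointer for adapter_model.safetensors, copied from the hunk above.
pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:8b49ade7c4cf516261beb948b85950f3fa2e2ddcbde5e64a5f6affa856914ab9
size 13648432
"""
)
print(pointer["algo"], pointer["size"])
```

The size is the same before and after the change (13648432 bytes) while the `oid` differs. That is consistent with retrained adapter weights of the same shape.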

runs/Jan20_00-27-19_025adacbeb60/events.out.tfevents.1705710535.025adacbeb60.563.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a193590be998655dcce0e3dfbae229af01b925b7565766f7dca02c4b9eeefcc
+size 12594

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1f024c7aa0c4ea5720ac542d2212c8ced82fd18caf8829ed9bfbd5dc0d8e49b5
 size 4155

 version https://git-lfs.github.com/spec/v1
+oid sha256:55940e75baa40e4f90d801fd6e7d614e672d1ca4d898ec4755f67842cc0de741
 size 4155