haidermasood99/dpo-final

Files changed (5)

README.md +16 -16
adapter_config.json +1 -1
adapter_model.safetensors +2 -2
runs/Jun10_16-35-12_008435c9551a/events.out.tfevents.1718037545.008435c9551a.1951.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6557
-- Rewards/chosen: -0.0382
-- Rewards/rejected: -0.1408
-- Rewards/accuracies: 0.75
-- Rewards/margins: 0.1027
-- Logps/rejected: -129.7875
-- Logps/chosen: -131.0592
-- Logits/rejected: -2.0563
-- Logits/chosen: -2.3077
 ## Model description
@@ -57,13 +57,13 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
-|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.6769        | 0.005 | 10   | 0.6888          | -0.0796        | -0.0900          | 0.4375             | 0.0104          | -129.2791      | -131.4733    | -2.0870         | -2.3346       |
-| 0.713         | 0.01  | 20   | 0.7072          | -0.1185        | -0.0997          | 0.4375             | -0.0188         | -129.3760      | -131.8626    | -2.0821         | -2.3307       |
-| 0.7127        | 0.015 | 30   | 0.6886          | 0.0301         | 0.0180           | 0.5                | 0.0121          | -128.1994      | -130.3766    | -2.0743         | -2.3175       |
-| 0.6974        | 0.02  | 40   | 0.6637          | 0.0450         | -0.0273          | 0.5                | 0.0722          | -128.6517      | -130.2277    | -2.0686         | -2.3128       |
-| 0.6725        | 0.025 | 50   | 0.6557          | -0.0382        | -0.1408          | 0.75               | 0.1027          | -129.7875      | -131.0592    | -2.0563         | -2.3077       |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.7095
+- Rewards/chosen: -0.1860
+- Rewards/rejected: -0.3362
+- Rewards/accuracies: 0.4904
+- Rewards/margins: 0.1502
+- Logps/rejected: -269.4139
+- Logps/chosen: -269.0661
+- Logits/rejected: -2.0876
+- Logits/chosen: -2.1662
 ## Model description
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6952        | 0.0002 | 10   | 0.6717          | 0.1018         | 0.0250           | 0.5769             | 0.0769          | -265.8023      | -266.1874    | -2.1074         | -2.1866       |
+| 0.7473        | 0.0003 | 20   | 0.6787          | 0.0390         | -0.0403          | 0.5192             | 0.0793          | -266.4547      | -266.8159    | -2.1064         | -2.1840       |
+| 0.6557        | 0.0005 | 30   | 0.7320          | -0.2017        | -0.2789          | 0.4904             | 0.0772          | -268.8405      | -269.2226    | -2.0938         | -2.1716       |
+| 0.8058        | 0.0007 | 40   | 0.7174          | -0.2018        | -0.3209          | 0.4808             | 0.1192          | -269.2612      | -269.2236    | -2.0878         | -2.1663       |
+| 0.5939        | 0.0009 | 50   | 0.7095          | -0.1860        | -0.3362          | 0.4904             | 0.1502          | -269.4139      | -269.0661    | -2.0876         | -2.1662       |
 ### Framework versions
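
As a sanity check on the reported metrics, the sketch below recomputes `Rewards/margins` from `Rewards/chosen` and `Rewards/rejected` for the final evaluation step of both README versions. It assumes the margin is the chosen reward minus the rejected reward, which the README itself does not state; the helper name `reward_margin` is hypothetical.

```python
# Final-step evaluation rewards copied from the two README versions in this diff.
runs = {
    "before": {"chosen": -0.0382, "rejected": -0.1408, "margin": 0.1027},
    "after": {"chosen": -0.1860, "rejected": -0.3362, "margin": 0.1502},
}


def reward_margin(chosen: float, rejected: float) -> float:
    """Margin as chosen minus rejected reward (assumed definition)."""
    return chosen - rejected


for name, r in runs.items():
    computed = reward_margin(r["chosen"], r["rejected"])
    # Reported values are rounded to 4 decimals, so allow a small tolerance.
    consistent = abs(computed - r["margin"]) < 2e-4
    print(f"{name}: computed={computed:.4f} reported={r['margin']:.4f} consistent={consistent}")
```

Under that assumption both rows agree with the reported margins to within rounding.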

adapter_config.json CHANGED

@@ -1,7 +1,7 @@
 {
   "alpha_pattern": {},
   "auto_mapping": null,
-  "base_model_name_or_path": "TheBloke/OpenHermes-2-Mistral-7B-GPTQ",
   "bias": "none",
   "fan_in_fan_out": false,
   "inference_mode": true,

 {
   "alpha_pattern": {},
   "auto_mapping": null,
+  "base_model_name_or_path": null,
   "bias": "none",
   "fan_in_fan_out": false,
   "inference_mode": true,
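
With `base_model_name_or_path` now `null`, code that reads the adapter config can no longer learn the base model from it. The sketch below shows one way a loader might fall back to an explicitly supplied model id. The helper `resolve_base_model` is hypothetical, the JSON is only the fragment visible in this diff with the remaining keys omitted, and using the previous value as the fallback is an assumption.

```python
import json

# Fragment of the new adapter_config.json shown in this diff (other keys omitted).
ADAPTER_CONFIG_TEXT = """
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": null,
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true
}
"""


def resolve_base_model(config: dict, fallback: str) -> str:
    """Return the base model id from the config, or the fallback if it is null or missing."""
    return config.get("base_model_name_or_path") or fallback


config = json.loads(ADAPTER_CONFIG_TEXT)
# Assumption: the base model is still the one named in the previous config version.
base_id = resolve_base_model(config, fallback="TheBloke/OpenHermes-2-Mistral-7B-GPTQ")
print(base_id)
```

The resolved id would then be passed to whatever code loads the base model before attaching the adapter.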

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c3bae9155c2d0d65d1239a10c829c15d6333e2ce877e07fc43d4a4c6f2ae214e
-size 13648432

 version https://git-lfs.github.com/spec/v1
+oid sha256:b47e80fd5cf74d5339d80a3cee2ef90cda0d8319fc226196cb8a2b08b807eaa0
+size 13650608
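
The binary files in this commit appear only as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. A minimal sketch of reading the two `adapter_model.safetensors` pointers above and comparing their sizes follows; the function name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is recorded in bytes
    return fields


old = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:c3bae9155c2d0d65d1239a10c829c15d6333e2ce877e07fc43d4a4c6f2ae214e
size 13648432
""")
new = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:b47e80fd5cf74d5339d80a3cee2ef90cda0d8319fc226196cb8a2b08b807eaa0
size 13650608
""")

print("content changed:", old["oid"] != new["oid"])
print("size delta (bytes):", new["size"] - old["size"])
```

The differing object ids show the weights file was replaced, and the size grew by 2176 bytes.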

runs/Jun10_16-35-12_008435c9551a/events.out.tfevents.1718037545.008435c9551a.1951.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:35aa835a890fdbba8db9efab3778bff174cc286c23bfa9b49bd21a8412a162fc
+size 14251

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2bdeac99f439d4313c7dc854de7c9b45af15b8eac9cc37e4a7c7fb81f1f5bbc5
 size 5307

 version https://git-lfs.github.com/spec/v1
+oid sha256:4f9e015dce46de952a76279728db4a625f622a94247431a72ef737a5255b424e
 size 5307