End of training

Files changed (4)

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/zephyr-7B-beta-GPTQ](https://huggingface.co/TheBloke/zephyr-7B-beta-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3048
+- Loss: 0.2808
 ## Model description
@@ -44,15 +44,18 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 1
-- training_steps: 200
+- training_steps: 500
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.3611        | 0.01  | 100  | 0.3279          |
-| 0.2922        | 0.01  | 200  | 0.3048          |
+| 0.3461        | 0.01  | 100  | 0.3273          |
+| 0.2884        | 0.01  | 200  | 0.2983          |
+| 0.2665        | 0.02  | 300  | 0.2891          |
+| 0.2841        | 0.03  | 400  | 0.2835          |
+| 0.2649        | 0.03  | 500  | 0.2808          |
 ### Framework versions
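
As a quick reading aid for the README change, the sketch below recomputes how much the final evaluation loss dropped between the old run (200 steps) and the new run (500 steps). It uses only the numbers in the diff; the variable and function names are illustrative and do not come from the training script.

```python
# Validation losses copied from the README diff (step -> validation loss).
old_run = {100: 0.3279, 200: 0.3048}
new_run = {100: 0.3273, 200: 0.2983, 300: 0.2891, 400: 0.2835, 500: 0.2808}


def final_loss(run):
    """Return the validation loss at the last logged step."""
    return run[max(run)]


def relative_improvement(before, after):
    """Fractional reduction in loss going from `before` to `after`."""
    return (before - after) / before


old_final = final_loss(old_run)
new_final = final_loss(new_run)
improvement = relative_improvement(old_final, new_final)

# Check whether validation loss falls at every logged step of the new run.
steps = sorted(new_run)
monotonic = all(new_run[a] > new_run[b] for a, b in zip(steps, steps[1:]))

print(f"final eval loss: {old_final:.4f} -> {new_final:.4f}")
print(f"relative improvement: {improvement:.1%}")
print(f"validation loss decreasing at every step: {monotonic}")
```

The training loss in the new table is not monotonic (it rises at step 400), so the check is applied to validation loss only.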

adapter_config.json CHANGED

@@ -20,9 +20,9 @@
   "revision": null,
   "target_modules": [
     "k_proj",
+    "q_proj",
     "v_proj",
-    "o_proj",
-    "q_proj"
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
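
The `adapter_config.json` change appears to reorder `target_modules` without adding or removing any entry. A minimal sketch of that check, assuming order does not matter to the consumer of the config (the diff itself does not say), compares the two lists as sets. The JSON fragments are trimmed to the one field shown in the diff, and `same_modules` is a hypothetical helper.

```python
import json

# Only the field visible in the diff; the real files contain more keys.
old_cfg = json.loads('{"target_modules": ["k_proj", "v_proj", "o_proj", "q_proj"]}')
new_cfg = json.loads('{"target_modules": ["k_proj", "q_proj", "v_proj", "o_proj"]}')


def same_modules(a, b):
    """True if both configs target the same modules, ignoring list order."""
    return set(a["target_modules"]) == set(b["target_modules"])


order_changed = old_cfg["target_modules"] != new_cfg["target_modules"]
membership_unchanged = same_modules(old_cfg, new_cfg)

print(f"order changed: {order_changed}")
print(f"same set of modules: {membership_unchanged}")
```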

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3a6bfc2359f525b1d96d5e0b99b5ac34d6ef141bba9d41799458db5c9cfdca73
+oid sha256:df2533c761ba0a6ca4210270d7f0ee7d41b7d9d8c0204112db1a4a98e7df7a7b
 size 54560368

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6698bc4d558a208dd37a1d05b7500e02e829cc79558d34e5bf75782b5f0c28db
+oid sha256:c94081332a1382a96ec32c52f7b47f74ddab3b431ce41c07e2ecee7b3140c901
 size 4728
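
Both binary files are stored as Git LFS pointers, so the diff shows only the `version`, `oid`, and `size` lines rather than the file contents. The sketch below parses the two `adapter_model.safetensors` pointers from the diff and confirms that the content hash changed while the byte size stayed the same. `parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of any Git LFS tooling.

```python
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3a6bfc2359f525b1d96d5e0b99b5ac34d6ef141bba9d41799458db5c9cfdca73
size 54560368
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:df2533c761ba0a6ca4210270d7f0ee7d41b7d9d8c0204112db1a4a98e7df7a7b
size 54560368
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash-algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

content_changed = old["digest"] != new["digest"]
same_size = old["size"] == new["size"]

print(f"content changed: {content_changed}, same size: {same_size}")
```

An unchanged size with a new hash is what one would expect if the adapter keeps the same tensor shapes and only the weight values differ, though the diff alone does not confirm this.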