shawgpt-ft-lr2e-05-wd0.001

Files changed (3)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8377
 ## Model description
@@ -51,18 +51,18 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.6425        | 0.9231  | 3    | 4.2073          |
-| 4.5674        | 1.8462  | 6    | 4.1547          |
-| 4.5171        | 2.7692  | 9    | 4.1042          |
-| 3.3351        | 4.0     | 13   | 4.0386          |
-| 4.4006        | 4.9231  | 16   | 3.9922          |
-| 4.3239        | 5.8462  | 19   | 3.9508          |
-| 4.2857        | 6.7692  | 22   | 3.9157          |
-| 3.1631        | 8.0     | 26   | 3.8788          |
-| 4.1916        | 8.9231  | 29   | 3.8585          |
-| 4.1817        | 9.8462  | 32   | 3.8452          |
-| 4.1675        | 10.7692 | 35   | 3.8383          |
-| 0.9797        | 11.0769 | 36   | 3.8377          |
 ### Framework versions

 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.8462
 ## Model description
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 4.6427        | 0.9231  | 3    | 4.2081          |
+| 4.5686        | 1.8462  | 6    | 4.1558          |
+| 4.5186        | 2.7692  | 9    | 4.1062          |
+| 3.3366        | 4.0     | 13   | 4.0413          |
+| 4.4034        | 4.9231  | 16   | 3.9956          |
+| 4.3273        | 5.8462  | 19   | 3.9553          |
+| 4.2905        | 6.7692  | 22   | 3.9216          |
+| 3.1678        | 8.0     | 26   | 3.8862          |
+| 4.199         | 8.9231  | 29   | 3.8667          |
+| 4.1896        | 9.8462  | 32   | 3.8536          |
+| 4.1758        | 10.7692 | 35   | 3.8468          |
+| 0.9823        | 11.0769 | 36   | 3.8462          |
 ### Framework versions
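
The two validation-loss columns in this README diff can be compared step by step. The snippet below is a minimal sketch, not part of the commit: the values are copied from the removed (`-`) and added (`+`) table rows, and the helper name `val_loss_deltas` is a hypothetical one introduced for illustration.

```python
# Validation losses copied from the removed (-) and added (+) rows of the
# README.md results table in this diff. Nothing here comes from the model repo's code.
steps = [3, 6, 9, 13, 16, 19, 22, 26, 29, 32, 35, 36]
old_val = [4.2073, 4.1547, 4.1042, 4.0386, 3.9922, 3.9508,
           3.9157, 3.8788, 3.8585, 3.8452, 3.8383, 3.8377]
new_val = [4.2081, 4.1558, 4.1062, 4.0413, 3.9956, 3.9553,
           3.9216, 3.8862, 3.8667, 3.8536, 3.8468, 3.8462]


def val_loss_deltas(old, new):
    """Return new - old for each logged step, rounded to 4 decimals.

    Hypothetical helper for illustration only.
    """
    return [round(n - o, 4) for o, n in zip(old, new)]


deltas = val_loss_deltas(old_val, new_val)
for step, delta in zip(steps, deltas):
    # A positive delta means the new run logged a higher validation loss.
    print(f"step {step:>2}: {delta:+.4f}")
```

Under these numbers the new run's validation loss is higher at every logged step, and the final gap matches the change in the reported evaluation loss (3.8377 to 3.8462).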

runs/Oct22_11-03-11_99867d27916d/events.out.tfevents.1729594992.99867d27916d.882.49 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:1ab4eb46b70197dabc4127456bdd611d1ec7bf3e81d11c177f4bea94795cc952
+size 11654

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:712cfad7806a42f67efe38713c812c12da04742473bc95d73cb61a086339171b
 size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:5e496dff0c51826fc313b51e843a985d4e8efefebd37a0f8e47fea523360cd21
 size 5240