shawgpt-ft-lr2e-05-wd0.01

Files changed (3)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.3895
+- Loss: 3.3804
 ## Model description
@@ -51,18 +51,18 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 4.189         | 0.9231  | 3    | 3.8148          |
-| 4.0985        | 1.8462  | 6    | 3.7451          |
-| 4.0272        | 2.7692  | 9    | 3.6804          |
-| 2.9575        | 4.0     | 13   | 3.6039          |
-| 3.8955        | 4.9231  | 16   | 3.5534          |
-| 3.8168        | 5.8462  | 19   | 3.5097          |
-| 3.7748        | 6.7692  | 22   | 3.4725          |
-| 2.788         | 8.0     | 26   | 3.4331          |
-| 3.6891        | 8.9231  | 29   | 3.4118          |
-| 3.6734        | 9.8462  | 32   | 3.3975          |
-| 3.6644        | 10.7692 | 35   | 3.3902          |
-| 0.8447        | 11.0769 | 36   | 3.3895          |
+| 4.1779        | 0.9231  | 3    | 3.8026          |
+| 4.0882        | 1.8462  | 6    | 3.7331          |
+| 4.0172        | 2.7692  | 9    | 3.6688          |
+| 2.9507        | 4.0     | 13   | 3.5927          |
+| 3.8869        | 4.9231  | 16   | 3.5431          |
+| 3.8087        | 5.8462  | 19   | 3.4996          |
+| 3.768         | 6.7692  | 22   | 3.4629          |
+| 2.7826        | 8.0     | 26   | 3.4239          |
+| 3.6824        | 8.9231  | 29   | 3.4026          |
+| 3.6671        | 9.8462  | 32   | 3.3882          |
+| 3.6579        | 10.7692 | 35   | 3.3812          |
+| 0.8452        | 11.0769 | 36   | 3.3804          |
 ### Framework versions

runs/Oct22_09-52-11_99867d27916d/events.out.tfevents.1729590732.99867d27916d.882.27 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:5c3d96a4cccd99d3cd9b8b3b194571897b2b758b3827d9ecb91508995f10fbc8
+size 11650

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a6ae66c2d1a6d2a426cbbc84b2b6c96cae87b0f848065205efa6564d0194cee7
+oid sha256:046196edb25c5e9bded3350424bc9083e2e49b364edae8bb35ab1217ab49f251
 size 5240