End of training

Files changed (4)

README.md +23 -22
adapter_config.json +5 -5
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -20,8 +20,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4736
-- Accuracy: 0.8
 ## Model description
@@ -46,32 +46,33 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - training_steps: 200
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Accuracy |
 |:-------------:|:------:|:----:|:---------------:|:--------:|
-| 0.7109        | 0.2941 | 10   | 0.7341          | 0.3684   |
-| 0.7148        | 0.5882 | 20   | 0.7228          | 0.3947   |
-| 0.6992        | 0.8824 | 30   | 0.7039          | 0.4895   |
-| 0.668         | 1.1765 | 40   | 0.6780          | 0.5789   |
-| 0.6328        | 1.4706 | 50   | 0.6501          | 0.6474   |
-| 0.6211        | 1.7647 | 60   | 0.6144          | 0.6737   |
-| 0.5781        | 2.0588 | 70   | 0.5833          | 0.6842   |
-| 0.5156        | 2.3529 | 80   | 0.5591          | 0.7      |
-| 0.5352        | 2.6471 | 90   | 0.5355          | 0.7368   |
-| 0.5508        | 2.9412 | 100  | 0.5179          | 0.7421   |
-| 0.543         | 3.2353 | 110  | 0.4917          | 0.7579   |
-| 0.4141        | 3.5294 | 120  | 0.4833          | 0.7684   |
-| 0.4102        | 3.8235 | 130  | 0.4706          | 0.7737   |
-| 0.4316        | 4.1176 | 140  | 0.4643          | 0.7789   |
-| 0.4844        | 4.4118 | 150  | 0.4683          | 0.8      |
-| 0.4199        | 4.7059 | 160  | 0.4668          | 0.8      |
-| 0.4082        | 5.0    | 170  | 0.4795          | 0.7842   |
-| 0.3516        | 5.2941 | 180  | 0.4804          | 0.7842   |
-| 0.4238        | 5.5882 | 190  | 0.4826          | 0.7947   |
-| 0.3867        | 5.8824 | 200  | 0.4736          | 0.8      |
 ### Framework versions

 This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4238
+- Accuracy: 0.7778
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 20
 - training_steps: 200
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Accuracy |
 |:-------------:|:------:|:----:|:---------------:|:--------:|
+| 0.7031        | 0.1087 | 10   | 0.6740          | 0.5752   |
+| 0.6719        | 0.2174 | 20   | 0.6701          | 0.6046   |
+| 0.6602        | 0.3261 | 30   | 0.6497          | 0.6601   |
+| 0.6367        | 0.4348 | 40   | 0.6113          | 0.7059   |
+| 0.6172        | 0.5435 | 50   | 0.5686          | 0.7320   |
+| 0.5625        | 0.6522 | 60   | 0.5276          | 0.7451   |
+| 0.5664        | 0.7609 | 70   | 0.4938          | 0.7451   |
+| 0.5859        | 0.8696 | 80   | 0.4651          | 0.7712   |
+| 0.5           | 0.9783 | 90   | 0.4560          | 0.7647   |
+| 0.5898        | 1.0870 | 100  | 0.4560          | 0.7582   |
+| 0.5664        | 1.1957 | 110  | 0.4459          | 0.7516   |
+| 0.4648        | 1.3043 | 120  | 0.4387          | 0.7745   |
+| 0.5117        | 1.4130 | 130  | 0.4306          | 0.7712   |
+| 0.4219        | 1.5217 | 140  | 0.4239          | 0.7680   |
+| 0.3828        | 1.6304 | 150  | 0.4314          | 0.7680   |
+| 0.3789        | 1.7391 | 160  | 0.4319          | 0.7647   |
+| 0.3828        | 1.8478 | 170  | 0.4291          | 0.7680   |
+| 0.3438        | 1.9565 | 180  | 0.4279          | 0.7745   |
+| 0.3496        | 2.0652 | 190  | 0.4274          | 0.7712   |
+| 0.3945        | 2.1739 | 200  | 0.4238          | 0.7778   |
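The Epoch column advances at a different rate in the two tables, even though both runs stop at step 200. A minimal sketch of the arithmetic, using only the first and last rows of each table (the variable and function names are illustrative, not from the training script):

```python
# (step, epoch) pairs copied from the first and last rows of each results table.
old_rows = [(10, 0.2941), (200, 5.8824)]
new_rows = [(10, 0.1087), (200, 2.1739)]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from logged (step, epoch) pairs."""
    estimates = {round(step / epoch) for step, epoch in rows}
    # Every row should agree once the rounding in the logged epoch is removed.
    assert len(estimates) == 1, estimates
    return estimates.pop()


old_spe = steps_per_epoch(old_rows)
new_spe = steps_per_epoch(new_rows)
print(old_spe, new_spe)
```

Under these numbers the earlier run took about 34 steps per epoch and the new run about 92. That is consistent with a larger training set or a smaller effective batch size, but the diff does not show which of the two changed.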
 ### Framework versions
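The new hyperparameter list combines `lr_scheduler_type: linear` with `lr_scheduler_warmup_steps: 20` and `training_steps: 200`. The sketch below shows the learning-rate multiplier such a schedule would produce, assuming the usual linear-warmup-then-linear-decay rule. It is a plain-Python illustration, not the trainer's own code, and the base learning rate is not visible in this hunk, so only the multiplier is computed.

```python
def linear_warmup_multiplier(step, warmup_steps=20, training_steps=200):
    """Multiplier applied to the base learning rate at a given optimizer step.

    Assumed rule: ramp linearly from 0 to 1 over `warmup_steps`, then decay
    linearly back to 0 at `training_steps`.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    remaining = training_steps - step
    return max(0.0, remaining / max(1, training_steps - warmup_steps))


for step in (0, 10, 20, 110, 200):
    print(step, linear_warmup_multiplier(step))
```

With these settings the multiplier peaks at step 20, which matches the second logged row of the new results table, and reaches zero at the final step.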

adapter_config.json CHANGED

@@ -25,13 +25,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "gate_proj",
-    "k_proj",
     "o_proj",
-    "down_proj",
-    "v_proj",
     "q_proj",
-    "up_proj"
   ],
   "task_type": "SEQ_CLS",
   "use_dora": false,

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "o_proj",
+    "k_proj",
     "q_proj",
+    "up_proj",
+    "gate_proj",
+    "v_proj",
+    "down_proj"
   ],
   "task_type": "SEQ_CLS",
   "use_dora": false,
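The `target_modules` hunk shows five removals and five additions, but it may be only a reordering. A quick check, with both lists copied from the two sides of the diff (the variable names are illustrative):

```python
# target_modules as listed on the old and new sides of the diff.
old_target_modules = [
    "gate_proj", "k_proj", "o_proj", "down_proj", "v_proj", "q_proj", "up_proj",
]
new_target_modules = [
    "o_proj", "k_proj", "q_proj", "up_proj", "gate_proj", "v_proj", "down_proj",
]

same_set = set(old_target_modules) == set(new_target_modules)
same_order = old_target_modules == new_target_modules
print(same_set, same_order)
```

The two lists contain the same seven projection names in a different order, so the adapter targets the same modules in both revisions. One plausible cause, stated here as an assumption, is that the list is serialized from an unordered collection.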

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1b05d6ee9218a751f915d98379e369ceafb80bd3cd5541fa88499982c9bc311a
 size 39260648

 version https://git-lfs.github.com/spec/v1
+oid sha256:1830e519b9fb4dfa4e4c42a3a24e90246f11070dd81e5207fe8bb2b6b993a858
 size 39260648
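The two blocks above are Git LFS pointer files rather than the weights themselves. A small sketch of how the pointers can be compared, assuming the three-line `key value` layout shown in the diff (`parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:1b05d6ee9218a751f915d98379e369ceafb80bd3cd5541fa88499982c9bc311a
size 39260648
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:1830e519b9fb4dfa4e4c42a3a24e90246f11070dd81e5207fe8bb2b6b993a858
size 39260648
""")

content_changed = old_pointer["oid"] != new_pointer["oid"]
size_changed = old_pointer["size"] != new_pointer["size"]
print(content_changed, size_changed)
```

The hash differs while the byte size is identical, which is what one would expect when the adapter keeps the same tensor shapes and only the weight values change.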

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:32843623ebe38f8ab6155e3692cf643a1e11e57840ee40996c7cd8a239283642
 size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:9d5fe4c47db0ef08955bd7d9219f1ad59a76a9ba1728b459dff4c3c272321f78
 size 5240