Training in progress, step 50

Files changed (7)

README.md +32 -24
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -19,8 +19,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on the super_glue dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3573
-- Accuracy: 0.8580
 ## Model description
@@ -42,7 +42,7 @@ The following hyperparameters were used during training:
 - learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 4
-- seed: 0
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 2
@@ -56,27 +56,35 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 0.6397        | 0.05  | 50   | 0.4990          | 0.7802   |
-| 0.4151        | 0.1   | 100  | 0.4801          | 0.7972   |
-| 0.1805        | 0.15  | 150  | 0.4870          | 0.7965   |
-| 0.4189        | 0.2   | 200  | 0.4392          | 0.8240   |
-| 0.2601        | 0.25  | 250  | 0.4282          | 0.8311   |
-| 0.3197        | 0.3   | 300  | 0.4401          | 0.8375   |
-| 0.1839        | 0.35  | 350  | 0.4113          | 0.8382   |
-| 0.4575        | 0.4   | 400  | 0.3906          | 0.8417   |
-| 0.5272        | 0.45  | 450  | 0.3932          | 0.8466   |
-| 0.2914        | 0.5   | 500  | 0.4045          | 0.8445   |
-| 0.4161        | 0.55  | 550  | 0.4602          | 0.8382   |
-| 0.245         | 0.6   | 600  | 0.3853          | 0.8495   |
-| 0.4237        | 0.65  | 650  | 0.3665          | 0.8537   |
-| 0.1947        | 0.7   | 700  | 0.3632          | 0.8601   |
-| 0.4434        | 0.75  | 750  | 0.3511          | 0.8601   |
-| 0.4321        | 0.8   | 800  | 0.3905          | 0.8615   |
-| 0.2262        | 0.85  | 850  | 0.3542          | 0.8587   |
-| 0.41          | 0.9   | 900  | 0.3773          | 0.8572   |
-| 0.585         | 0.95  | 950  | 0.3701          | 0.8572   |
-| 0.3416        | 1.0   | 1000 | 0.3616          | 0.8608   |
-| 0.9198        | 1.05  | 1050 | 0.3589          | 0.8580   |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on the super_glue dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3412
+- Accuracy: 0.8678
 ## Model description
 - learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 4
+- seed: 1
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 2
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.4585        | 0.05  | 50   | 0.4872          | 0.7760   |
+| 0.3385        | 0.1   | 100  | 0.4658          | 0.7880   |
+| 0.8423        | 0.15  | 150  | 0.4830          | 0.8113   |
+| 0.4092        | 0.2   | 200  | 0.4115          | 0.8269   |
+| 0.4483        | 0.25  | 250  | 0.4318          | 0.8120   |
+| 0.3319        | 0.3   | 300  | 0.4008          | 0.8346   |
+| 0.3072        | 0.35  | 350  | 0.4538          | 0.8346   |
+| 0.2787        | 0.4   | 400  | 0.4432          | 0.8332   |
+| 0.2965        | 0.45  | 450  | 0.3921          | 0.8403   |
+| 0.2684        | 0.5   | 500  | 0.3777          | 0.8445   |
+| 0.2543        | 0.55  | 550  | 0.3793          | 0.8481   |
+| 0.196         | 0.6   | 600  | 0.3703          | 0.8509   |
+| 0.2828        | 0.65  | 650  | 0.3910          | 0.8509   |
+| 0.2073        | 0.7   | 700  | 0.3813          | 0.8587   |
+| 0.2733        | 0.75  | 750  | 0.3841          | 0.8650   |
+| 0.3351        | 0.8   | 800  | 0.3658          | 0.8643   |
+| 0.126         | 0.85  | 850  | 0.3713          | 0.8643   |
+| 0.3351        | 0.9   | 900  | 0.3457          | 0.8594   |
+| 0.3614        | 0.95  | 950  | 0.3645          | 0.8629   |
+| 0.3383        | 1.0   | 1000 | 0.3809          | 0.8615   |
+| 0.2019        | 1.05  | 1050 | 0.4630          | 0.8707   |
+| 0.5042        | 1.1   | 1100 | 0.3724          | 0.8657   |
+| 0.2159        | 1.15  | 1150 | 0.3449          | 0.8643   |
+| 0.4469        | 1.2   | 1200 | 0.3555          | 0.8693   |
+| 0.6519        | 1.25  | 1250 | 0.4500          | 0.8686   |
+| 0.0631        | 1.3   | 1300 | 0.4127          | 0.8678   |
+| 0.2844        | 1.35  | 1350 | 0.3950          | 0.8721   |
+| 0.3926        | 1.4   | 1400 | 0.3742          | 0.8714   |
+| 0.0955        | 1.45  | 1450 | 0.3810          | 0.8707   |
 ### Framework versions
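
Reviewer note: the hyperparameter lines in this README hunk imply an effective batch size that the visible diff context does not spell out. The sketch below derives it, assuming `train_batch_size` is the per-device value and that each logged step is one optimizer step; both are assumptions, not something the diff confirms. The epoch column may also be rounded, so the per-epoch figures are approximate.

```python
# Values copied from the hyperparameter lines in the README diff.
train_batch_size = 2             # assumption: this is the per-device batch size
num_devices = 2
gradient_accumulation_steps = 2

# Samples consumed per optimizer step across all devices.
effective_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# The results table logs epoch 0.05 at step 50, so roughly 1000 steps per epoch.
steps_per_epoch = round(50 / 0.05)

# Rough number of training examples seen per epoch
# (assumption: a logged "step" is one optimizer step).
approx_examples_per_epoch = effective_batch_size * steps_per_epoch

print(effective_batch_size, steps_per_epoch, approx_examples_per_epoch)
```

Under these assumptions the new table's last row (step 1450, epoch 1.45) corresponds to about 11,600 training examples processed.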

adapter_config.json CHANGED

@@ -19,10 +19,10 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v_proj",
     "q_proj",
     "gate_proj",
-    "down_proj"
   ],
   "task_type": "CAUSAL_LM"
 }

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "q_proj",
     "gate_proj",
+    "down_proj",
+    "v_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
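
Reviewer note: the `target_modules` change in this hunk looks like a reordering only. A minimal check, with both lists copied from the removed and added sides of the diff above (the variable names are illustrative):

```python
# Lists copied from the old and new sides of the adapter_config.json hunk.
old_target_modules = ["v_proj", "q_proj", "gate_proj", "down_proj"]
new_target_modules = ["q_proj", "gate_proj", "down_proj", "v_proj"]

# Compare membership while ignoring order, then compare order as well.
same_members = set(old_target_modules) == set(new_target_modules)
same_order = old_target_modules == new_target_modules

print(f"same members: {same_members}, same order: {same_order}")
```

If order is not significant to the adapter loader (an assumption here, not something the diff states), the two configurations target the same set of modules.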

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1449e751b1ac8d88a59a626967d40884b6ca2f12a5f83ade084bbbe51093d66f
 size 205555592

 version https://git-lfs.github.com/spec/v1
+oid sha256:46e79effb4eb07ba65dc4dd45dd352461632d89ecb1d76915ce809c8f5ddaff3
 size 205555592
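
Reviewer note: the weight files in this commit are Git LFS pointers, so each diff shows only a new `oid` with an unchanged `size`: the contents changed but the byte length did not. The sketch below parses a pointer and checks a downloaded file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is the new side of the hunk above.

```python
import hashlib

# New-side pointer for adapter_model.safetensors, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:46e79effb4eb07ba65dc4dd45dd352461632d89ecb1d76915ce809c8f5ddaff3
size 205555592
"""

def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte shards do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]

pointer = parse_lfs_pointer(POINTER)
```

For example, `matches_pointer("adapter_model.safetensors", pointer)` would be expected to return `True` for a local copy downloaded at this commit, assuming the download is complete and uncorrupted.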

model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6bff80c30292073d5da33263b8373c95248e2e6bceec40f4c94a791df5e2a6ca
 size 4943163992

 version https://git-lfs.github.com/spec/v1
+oid sha256:5cb4f6ee8818d7f3a150dd0ed9657780fddf17bc442f642063aa6515bfefefc0
 size 4943163992

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4c28d3e73304f62c2444300b2d5c4c6935fdf0de0375d13c23370038735eac7c
 size 4999821144

 version https://git-lfs.github.com/spec/v1
+oid sha256:03cd1ce2b0f84e5635ce6b03704c75b34f8cc2a11eb1e546f95428fef02937f0
 size 4999821144

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6334877f77cea916d690b02a81cc8e1055e711ab307818eacfb11dc9a6298627
 size 4540517840

 version https://git-lfs.github.com/spec/v1
+oid sha256:e3ab5f6637f696df3f15f4ba11068bfebad8eac84f0e7a4d2bde70a795aa5c0e
 size 4540517840

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2f3805618f1e860e17074772a001b83dfaa4c6ebfa170013656e3713f1e01a0b
 size 6008

 version https://git-lfs.github.com/spec/v1
+oid sha256:b16e4e572a5397d75c40edc507c121be0234f09479c08970ce847a154b86ed82
 size 6008