Kreses committed on
Commit
a7adc73
1 Parent(s): d263f05

End of training

Browse files
Files changed (3) hide show
  1. README.md +15 -15
  2. adapter_config.json +4 -4
  3. adapter_model.safetensors +2 -2
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
4
  - trl
5
  - sft
6
  - generated_from_trainer
7
- base_model: meta-llama/Llama-2-7b-chat-hf
8
  model-index:
9
  - name: output
10
  results: []
@@ -15,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
15
 
16
  # output
17
 
18
- This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on an unknown dataset.
19
  It achieves the following results on the evaluation set:
20
- - Loss: 1.5112
21
 
22
  ## Model description
23
 
@@ -51,18 +51,18 @@ The following hyperparameters were used during training:
51
 
52
  | Training Loss | Epoch | Step | Validation Loss |
53
  |:-------------:|:-----:|:----:|:---------------:|
54
- | 1.9596 | 0.29 | 1 | 1.7030 |
55
- | 2.0029 | 0.57 | 2 | 1.6746 |
56
- | 1.9649 | 0.86 | 3 | 1.6343 |
57
- | 1.8472 | 1.14 | 4 | 1.6023 |
58
- | 1.8243 | 1.43 | 5 | 1.5890 |
59
- | 1.8297 | 1.71 | 6 | 1.5809 |
60
- | 1.8483 | 2.0 | 7 | 1.5683 |
61
- | 1.7739 | 2.29 | 8 | 1.5528 |
62
- | 1.8205 | 2.57 | 9 | 1.5378 |
63
- | 1.7415 | 2.86 | 10 | 1.5262 |
64
- | 1.6532 | 3.14 | 11 | 1.5178 |
65
- | 1.7671 | 3.43 | 12 | 1.5112 |
66
 
67
 
68
  ### Framework versions
 
4
  - trl
5
  - sft
6
  - generated_from_trainer
7
+ base_model: meta-llama/Llama-2-70b-chat-hf
8
  model-index:
9
  - name: output
10
  results: []
 
15
 
16
  # output
17
 
18
+ This model is a fine-tuned version of [meta-llama/Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf) on an unknown dataset.
19
  It achieves the following results on the evaluation set:
20
+ - Loss: 1.3387
21
 
22
  ## Model description
23
 
 
51
 
52
  | Training Loss | Epoch | Step | Validation Loss |
53
  |:-------------:|:-----:|:----:|:---------------:|
54
+ | 1.7334 | 0.29 | 1 | 1.5084 |
55
+ | 1.7705 | 0.57 | 2 | 1.4977 |
56
+ | 1.7433 | 0.86 | 3 | 1.4736 |
57
+ | 1.6862 | 1.14 | 4 | 1.4434 |
58
+ | 1.6562 | 1.43 | 5 | 1.4161 |
59
+ | 1.615 | 1.71 | 6 | 1.3948 |
60
+ | 1.6227 | 2.0 | 7 | 1.3813 |
61
+ | 1.5609 | 2.29 | 8 | 1.3706 |
62
+ | 1.619 | 2.57 | 9 | 1.3603 |
63
+ | 1.5298 | 2.86 | 10 | 1.3511 |
64
+ | 1.4428 | 3.14 | 11 | 1.3437 |
65
+ | 1.5641 | 3.43 | 12 | 1.3387 |
66
 
67
 
68
  ### Framework versions
adapter_config.json CHANGED
@@ -1,7 +1,7 @@
1
  {
2
  "alpha_pattern": {},
3
  "auto_mapping": null,
4
- "base_model_name_or_path": "meta-llama/Llama-2-7b-chat-hf",
5
  "bias": "none",
6
  "fan_in_fan_out": false,
7
  "inference_mode": true,
@@ -19,10 +19,10 @@
19
  "rank_pattern": {},
20
  "revision": null,
21
  "target_modules": [
22
- "v_proj",
23
- "q_proj",
24
  "k_proj",
25
- "o_proj"
 
 
26
  ],
27
  "task_type": "CAUSAL_LM",
28
  "use_rslora": false
 
1
  {
2
  "alpha_pattern": {},
3
  "auto_mapping": null,
4
+ "base_model_name_or_path": "meta-llama/Llama-2-70b-chat-hf",
5
  "bias": "none",
6
  "fan_in_fan_out": false,
7
  "inference_mode": true,
 
19
  "rank_pattern": {},
20
  "revision": null,
21
  "target_modules": [
 
 
22
  "k_proj",
23
+ "q_proj",
24
+ "o_proj",
25
+ "v_proj"
26
  ],
27
  "task_type": "CAUSAL_LM",
28
  "use_rslora": false
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ae952f03c8703076c365c1f560e6bc44b26bdaeaff581851ad2b643316bb9b06
3
- size 67143296
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:130835e790a62f150122503aa714b42ef32ec051654681b66876de6fa52a1a9d
3
+ size 262231096