sv469/phi-2_finetuned

Files changed (4)

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3177
 ## Model description
@@ -44,28 +44,24 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
-- training_steps: 500
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
-| 1.2127        | 2.5641  | 50   | 0.5753          |
-| 0.5175        | 5.1282  | 100  | 0.4192          |
-| 0.3604        | 7.6923  | 150  | 0.3369          |
-| 0.2555        | 10.2564 | 200  | 0.3104          |
-| 0.1953        | 12.8205 | 250  | 0.3153          |
-| 0.1453        | 15.3846 | 300  | 0.3106          |
-| 0.1237        | 17.9487 | 350  | 0.3059          |
-| 0.0965        | 20.5128 | 400  | 0.3080          |
-| 0.0842        | 23.0769 | 450  | 0.3207          |
-| 0.0751        | 25.6410 | 500  | 0.3177          |
 ### Framework versions
 - PEFT 0.11.1
-- Transformers 4.42.3
 - Pytorch 2.1.2
 - Datasets 2.20.0
 - Tokenizers 0.19.1

 This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1627
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 2
+- training_steps: 300
 ### Training results
 | Training Loss | Epoch   | Step | Validation Loss |
 |:-------------:|:-------:|:----:|:---------------:|
+| 0.92          | 4.6512  | 50   | 0.3772          |
+| 0.2534        | 9.3023  | 100  | 0.2253          |
+| 0.1327        | 13.9535 | 150  | 0.1858          |
+| 0.0795        | 18.6047 | 200  | 0.1676          |
+| 0.0601        | 23.2558 | 250  | 0.1641          |
+| 0.0483        | 27.9070 | 300  | 0.1627          |
 ### Framework versions
 - PEFT 0.11.1
+- Transformers 4.42.4
 - Pytorch 2.1.2
 - Datasets 2.20.0
 - Tokenizers 0.19.1
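
The Epoch column advances faster in the new run than in the old one (4.6512 versus 2.5641 epochs at step 50). A minimal sketch of the arithmetic, using only the first row of each table; the helper name `steps_per_epoch` is illustrative and not part of the repository. The diff does not say whether the training set shrank or the effective batch size grew, so the result shows only that the steps-per-epoch figure changed.

```python
# (epoch, step) pairs copied from the first row of each training-results table.
old_first_row = (2.5641, 50)  # removed side of the diff
new_first_row = (4.6512, 50)  # added side of the diff


def steps_per_epoch(epoch: float, step: int) -> float:
    """Optimizer steps taken per full pass over the training set."""
    return step / epoch


old_rate = steps_per_epoch(*old_first_row)
new_rate = steps_per_epoch(*new_first_row)

print(f"old run: {old_rate:.2f} steps/epoch")
print(f"new run: {new_rate:.2f} steps/epoch")
```

Under these numbers the old run took about 19.5 steps per epoch and the new run about 10.75, so the 300 steps of the new run cover slightly more epochs (27.9) than the 500 steps of the old run (25.6).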

adapter_config.json CHANGED

@@ -20,10 +20,10 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v_proj",
     "q_proj",
-    "dense",
-    "k_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "q_proj",
+    "k_proj",
+    "v_proj",
+    "dense"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
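
The `target_modules` hunk lists the same four names on both sides, in a different order. A small sketch that checks this, assuming module order carries no meaning for the adapter (the lists are copied from the diff; the variable names are illustrative):

```python
# target_modules as they appear on each side of the adapter_config.json diff.
old_target_modules = ["v_proj", "q_proj", "dense", "k_proj"]  # removed side
new_target_modules = ["q_proj", "k_proj", "v_proj", "dense"]  # added side

# Compare as sets to ignore ordering, and as lists to detect reordering.
same_modules = set(old_target_modules) == set(new_target_modules)
same_order = old_target_modules == new_target_modules

print("same set of modules:", same_modules)
print("same order:", same_order)
```

If that assumption holds, this hunk reflects a serialization difference rather than a change to which layers the adapter targets.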

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a34793be02fbd4f2282497a6888db4174dbfeaa36ed557e84eafb2b155ac02cf
 size 83920464

 version https://git-lfs.github.com/spec/v1
+oid sha256:87917607e271e584fc3b86415fdfd0cab6ace7f7d2be189e9eb9f91ddf1967f0
 size 83920464

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7c464dbc10d45536d440e8620e6d50fe8f5f796add6364694cef349c0e3c281d
 size 5112

 version https://git-lfs.github.com/spec/v1
+oid sha256:f99b3219f11780e86b2f5401e3c4473722529a744a068013da9406cb4ef7d065
 size 5112
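
Both binary files are stored as Git LFS pointers, and in each case the `size` is unchanged while the `oid` differs, so the content changed without changing length. A hedged sketch of how a downloaded file could be checked against such a pointer; `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this example, not part of any library.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that raw file bytes have the size and hash the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f99b3219f11780e86b2f5401e3c4473722529a744a068013da9406cb4ef7d065\n"
    "size 5112\n"
)
print(pointer["algo"], pointer["size"])
```

To verify a local copy, read its bytes and pass them to `matches_pointer` together with the parsed pointer. The sketch hashes the whole file in memory, which is fine for the 5112-byte `training_args.bin` but may call for chunked hashing on larger files.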