Model save

Files changed (4)

README.md +12 -22
adapter_model.safetensors +1 -1
all_results.json +6 -6
train_results.json +6 -6

README.md CHANGED

@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2908
+- Loss: 0.0833
 ## Model description
@@ -46,32 +46,22 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
-- training_steps: 2048
+- training_steps: 1024
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.0815        | 0.1892 | 100  | 0.2948          |
-| 0.0434        | 0.3784 | 200  | 0.2018          |
-| 0.0456        | 0.5676 | 300  | 0.2325          |
-| 0.0303        | 0.7569 | 400  | 0.1661          |
-| 0.0743        | 0.9461 | 500  | 0.1364          |
-| 0.0324        | 1.1353 | 600  | 0.1452          |
-| 0.0255        | 1.3245 | 700  | 0.2203          |
-| 0.0372        | 1.5137 | 800  | 0.2048          |
-| 0.0236        | 1.7029 | 900  | 0.2011          |
-| 0.002         | 1.8921 | 1000 | 0.2422          |
-| 0.0107        | 2.0814 | 1100 | 0.2662          |
-| 0.0099        | 2.2706 | 1200 | 0.2508          |
-| 0.0199        | 2.4598 | 1300 | 0.3019          |
-| 0.005         | 2.6490 | 1400 | 0.2671          |
-| 0.0297        | 2.8382 | 1500 | 0.2541          |
-| 0.0011        | 3.0274 | 1600 | 0.2923          |
-| 0.0152        | 3.2167 | 1700 | 0.2680          |
-| 0.0007        | 3.4059 | 1800 | 0.2882          |
-| 0.0098        | 3.5951 | 1900 | 0.2746          |
-| 0.0251        | 3.7843 | 2000 | 0.2908          |
+| 0.166         | 0.1110 | 100  | 0.1586          |
+| 0.0995        | 0.2220 | 200  | 0.1237          |
+| 0.1139        | 0.3330 | 300  | 0.0977          |
+| 0.0505        | 0.4440 | 400  | 0.0790          |
+| 0.029         | 0.5550 | 500  | 0.1129          |
+| 0.047         | 0.6660 | 600  | 0.0783          |
+| 0.052         | 0.7770 | 700  | 0.0841          |
+| 0.0416        | 0.8880 | 800  | 0.0718          |
+| 0.0349        | 0.9990 | 900  | 0.0810          |
+| 0.0618        | 1.1100 | 1000 | 0.0833          |
 ### Framework versions
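
As a reading aid for the added rows of the training results table, the following minimal sketch parses them and compares the last logged evaluation with the lowest validation loss. The `ROWS` string is copied from the `+` lines of the README.md diff; the helper name `parse_rows` is illustrative and not part of the repository.

```python
# Rows copied verbatim from the "+" lines of the README.md diff.
ROWS = """\
| 0.166         | 0.1110 | 100  | 0.1586          |
| 0.0995        | 0.2220 | 200  | 0.1237          |
| 0.1139        | 0.3330 | 300  | 0.0977          |
| 0.0505        | 0.4440 | 400  | 0.0790          |
| 0.029         | 0.5550 | 500  | 0.1129          |
| 0.047         | 0.6660 | 600  | 0.0783          |
| 0.052         | 0.7770 | 700  | 0.0841          |
| 0.0416        | 0.8880 | 800  | 0.0718          |
| 0.0349        | 0.9990 | 900  | 0.0810          |
| 0.0618        | 1.1100 | 1000 | 0.0833          |
"""


def parse_rows(table: str) -> list[dict]:
    """Parse markdown table rows into dicts (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        # Drop the outer pipes, then split the four cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            }
        )
    return rows


rows = parse_rows(ROWS)
best = min(rows, key=lambda r: r["val_loss"])  # lowest validation loss
last = rows[-1]                                # last logged evaluation

print(f"best: step {best['step']}, validation loss {best['val_loss']}")
print(f"last: step {last['step']}, validation loss {last['val_loss']}")
```

On these rows the lowest validation loss (0.0718) is logged at step 800, while the 0.0833 reported under "Loss" matches the last logged evaluation at step 1000.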

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:28b91b282269524bb58185924d517bc839967735869215cb8a8ef5222b280ebc
+oid sha256:7efc50c3c5a1c66159bbcd087a587e7d69205edefe756fdaec361309f1b1a4b4
 size 2115012328

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
-    "epoch": 3.8751182592242195,
-    "total_flos": 1.809303241400451e+18,
-    "train_loss": 0.11459405875015705,
-    "train_runtime": 18906.0633,
-    "train_samples_per_second": 6.933,
-    "train_steps_per_second": 0.108
+    "epoch": 1.136672679339531,
+    "total_flos": 7.835717038709146e+17,
+    "train_loss": 0.22367067684899666,
+    "train_runtime": 8651.0001,
+    "train_samples_per_second": 7.576,
+    "train_steps_per_second": 0.118
 }

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
-    "epoch": 3.8751182592242195,
-    "total_flos": 1.809303241400451e+18,
-    "train_loss": 0.11459405875015705,
-    "train_runtime": 18906.0633,
-    "train_samples_per_second": 6.933,
-    "train_steps_per_second": 0.108
+    "epoch": 1.136672679339531,
+    "total_flos": 7.835717038709146e+17,
+    "train_loss": 0.22367067684899666,
+    "train_runtime": 8651.0001,
+    "train_samples_per_second": 7.576,
+    "train_steps_per_second": 0.118
 }
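
The throughput fields in the two JSON files can be cross-checked against `training_steps` from the README.md diff and `train_runtime`. The sketch below is only a sanity check under one stated assumption: an effective batch of 64 samples per optimizer step. That value is not stated anywhere in this commit; it is inferred from the ratio of samples per second to steps per second. The function name `throughput` is illustrative.

```python
def throughput(training_steps: int, runtime_s: float, batch_per_step: int):
    """Return (steps per second, samples per second) for one training run."""
    steps_per_s = training_steps / runtime_s
    return steps_per_s, steps_per_s * batch_per_step


# ASSUMPTION: 64 samples per optimizer step. Not stated in the commit;
# inferred from train_samples_per_second / train_steps_per_second.
ASSUMED_BATCH = 64

# training_steps comes from the README.md diff, train_runtime from the JSON diffs.
runs = {
    "old": {"training_steps": 2048, "train_runtime": 18906.0633},
    "new": {"training_steps": 1024, "train_runtime": 8651.0001},
}

for name, run in runs.items():
    steps_per_s, samples_per_s = throughput(
        run["training_steps"], run["train_runtime"], ASSUMED_BATCH
    )
    print(f"{name}: {steps_per_s:.3f} steps/s, {samples_per_s:.3f} samples/s")
```

Under that assumption the computed values agree, to three decimals, with the reported `train_steps_per_second` and `train_samples_per_second` for both the old and the new run.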