Model save

Files changed (5)

README.md CHANGED

@@ -2,16 +2,12 @@
 license: apache-2.0
 library_name: peft
 tags:
-- alignment-handbook
-- trl
-- sft
-- generated_from_trainer
 - trl
 - sft
 - alignment-handbook
 - generated_from_trainer
 datasets:
-- nthakur/GSM8KInstruct-Parallel-instruct
 base_model: mistralai/Mistral-7B-v0.1
 model-index:
 - name: mistral-7b-v0.1-sft-mix-21st-mar-v0
@@ -23,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 # mistral-7b-v0.1-sft-mix-21st-mar-v0
-This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the nthakur/GSM8KInstruct-Parallel-instruct dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.0770
 ## Model description
@@ -44,7 +40,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 3e-06
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
@@ -61,7 +57,7 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.0726        | 1.0   | 4    | 1.0770          |
 ### Framework versions

 license: apache-2.0
 library_name: peft
 tags:
 - trl
 - sft
 - alignment-handbook
 - generated_from_trainer
 datasets:
+- generator
 base_model: mistralai/Mistral-7B-v0.1
 model-index:
 - name: mistral-7b-v0.1-sft-mix-21st-mar-v0
 # mistral-7b-v0.1-sft-mix-21st-mar-v0
+This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.9775
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-06
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 0.9321        | 1.0   | 7754 | 0.9775          |
 ### Framework versions

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5bd70cf34e4612374acfe6b951e4e06aafbbcd297687532b77498bac7ddb98a6
 size 335605144

 version https://git-lfs.github.com/spec/v1
+oid sha256:0b726de479b0f5832c32f01e6dc3b4a0cfec2542370816b1724f35e10965905d
 size 335605144
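
In the pointer diff above, the size stays at 335605144 bytes and only the SHA-256 oid changes. That is consistent with the adapter weights being replaced by a retrained adapter of the same shape. The sketch below shows one way a downloaded file could be checked against the new pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file's byte size and SHA-256 digest match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the "+" side of the diff.
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:0b726de479b0f5832c32f01e6dc3b4a0cfec2542370816b1724f35e10965905d
size 335605144
""")
print(new_pointer["algo"], new_pointer["size"])
```

Assuming the weights were downloaded to `adapter_model.safetensors` in the working directory, `matches_pointer(Path("adapter_model.safetensors"), new_pointer)` would report whether the local copy is the one this commit points to.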

all_results.json CHANGED

@@ -5,9 +5,9 @@
     "eval_samples": 3683,
     "eval_samples_per_second": 4.299,
     "eval_steps_per_second": 0.14,
-    "train_loss": 1.0347997546195984,
-    "train_runtime": 278.4186,
-    "train_samples": 698,
-    "train_samples_per_second": 0.431,
-    "train_steps_per_second": 0.014
 }

     "eval_samples": 3683,
     "eval_samples_per_second": 4.299,
     "eval_steps_per_second": 0.14,
+    "train_loss": 0.9098342092251821,
+    "train_runtime": 184655.0174,
+    "train_samples": 915466,
+    "train_samples_per_second": 1.344,
+    "train_steps_per_second": 0.042
 }

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 1.0347997546195984,
-    "train_runtime": 278.4186,
-    "train_samples": 698,
-    "train_samples_per_second": 0.431,
-    "train_steps_per_second": 0.014
 }

 {
     "epoch": 1.0,
+    "train_loss": 0.9098342092251821,
+    "train_runtime": 184655.0174,
+    "train_samples": 915466,
+    "train_samples_per_second": 1.344,
+    "train_steps_per_second": 0.042
 }
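
The updated training figures can be cross-checked with a little arithmetic. The sketch below uses only numbers from this diff: the runtime and throughput from `train_results.json`, and the step count from the README results table. The "implied samples per step" value is an inference from the two reported rates, not something the diff states. The diff shows `train_batch_size: 16` but not the device count or gradient accumulation, so treat it as an estimate.

```python
# Values copied from the "+" side of train_results.json and the README table.
train_runtime_s = 184655.0174
train_samples = 915466
train_steps = 7754  # "Step" column of the README results row
reported_samples_per_s = 1.344
reported_steps_per_s = 0.042

# Wall-clock duration of the run in hours.
runtime_hours = train_runtime_s / 3600

# Recompute steps/sec from the step count; this should round to the reported value.
steps_per_s = train_steps / train_runtime_s

# Assumption: the ratio of the two reported rates approximates the number of
# samples consumed per optimizer step (the effective global batch size).
implied_samples_per_step = reported_samples_per_s / reported_steps_per_s

# Naive throughput if every one of the train_samples were processed once.
naive_samples_per_s = train_samples / train_runtime_s

print(f"runtime: {runtime_hours:.1f} h")
print(f"steps/s (recomputed): {steps_per_s:.3f}")
print(f"implied samples per step: {implied_samples_per_step:.0f}")
print(f"train_samples / runtime: {naive_samples_per_s:.2f} samples/s")
```

The recomputed steps per second agrees with the reported 0.042. The naive `train_samples / train_runtime` figure does not match the reported 1.344 samples per second, so the trainer evidently counts samples differently from the raw `train_samples` total. The diff does not say why.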

trainer_state.json CHANGED

The diff for this file is too large to render.