Model save

Files changed (5)

README.md CHANGED

@@ -1,14 +1,10 @@
 ---
 license: apache-2.0
-library_name: peft
 tags:
-- alignment-handbook
 - trl
 - sft
 - generated_from_trainer
-base_model: mistralai/Mistral-7B-v0.1
-datasets:
-- HuggingFaceH4/ultrachat_200k
 model-index:
 - name: mistral20p
   results: []
@@ -19,7 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 # mistral20p
-This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the HuggingFaceH4/ultrachat_200k dataset.
 ## Model description
@@ -54,12 +52,14 @@ The following hyperparameters were used during training:
 ### Training results
 ### Framework versions
-- PEFT 0.11.1
 - Transformers 4.41.1
 - Pytorch 2.2.2+cu121
 - Datasets 2.19.1
-- Tokenizers 0.19.1

 ---
 license: apache-2.0
+base_model: mistralai/Mistral-7B-v0.1
 tags:
 - trl
 - sft
 - generated_from_trainer
 model-index:
 - name: mistral20p
   results: []
 # mistral20p
+This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: nan
 ## Model description
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 0.702         | 1.0   | 406  | nan             |
 ### Framework versions
 - Transformers 4.41.1
 - Pytorch 2.2.2+cu121
 - Datasets 2.19.1
+- Tokenizers 0.19.1

all_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
-    "total_flos": 1.6764562374852608e+16,
-    "train_loss": 0.0,
-    "train_runtime": 0.0156,
     "train_samples": 103932,
-    "train_samples_per_second": 6662118.554,
-    "train_steps_per_second": 104099.609
 }

 {
     "epoch": 1.0,
+    "total_flos": 2.1031592360023163e+18,
+    "train_loss": 0.7368329773689138,
+    "train_runtime": 67624.003,
     "train_samples": 103932,
+    "train_samples_per_second": 1.537,
+    "train_steps_per_second": 0.006
 }
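
The new throughput figures can be sanity-checked against the runtime. The following is a minimal sketch, not part of the commit: it assumes `train_samples_per_second` is roughly `train_samples / train_runtime` for this single-epoch run and that the 406 steps reported in the README table cover the whole run. The helper name `throughput` is hypothetical.

```python
import json
import math

# Values copied from the "+" side of all_results.json in this diff.
results = json.loads("""
{
    "epoch": 1.0,
    "total_flos": 2.1031592360023163e+18,
    "train_loss": 0.7368329773689138,
    "train_runtime": 67624.003,
    "train_samples": 103932,
    "train_samples_per_second": 1.537,
    "train_steps_per_second": 0.006
}
""")

# Step count taken from the README training-results table (assumed to be
# the total number of optimizer steps for the run).
TOTAL_STEPS = 406


def throughput(metrics, total_steps):
    """Recompute samples/s and steps/s from the runtime, rounded to 3 places."""
    runtime = metrics["train_runtime"]
    samples_per_s = round(metrics["train_samples"] / runtime, 3)
    steps_per_s = round(total_steps / runtime, 3)
    return samples_per_s, steps_per_s


samples_per_s, steps_per_s = throughput(results, TOTAL_STEPS)

# Compare the recomputed values with the ones stored in the file.
print(math.isclose(samples_per_s, results["train_samples_per_second"]))
print(math.isclose(steps_per_s, results["train_steps_per_second"]))
```

Under the same assumptions, the removed values (`train_runtime` of 0.0156 s with `train_loss` of 0.0) look like a run that did no actual training, which is consistent with the far larger `total_flos` and runtime in the new file.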

runs/May29_02-17-51_ip-172-31-69-60.ec2.internal/events.out.tfevents.1716949083.ip-172-31-69-60.ec2.internal.2097.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7476f148101391ba89c26b0967d6ba8a8cef3425f07d8f9f172ca84438216740
-size 22080

 version https://git-lfs.github.com/spec/v1
+oid sha256:4beee68ba65b3f2f4ae0aa844258acd5ce78ac04b534a5da91bd6400fc3c32c7
+size 22916

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 1.0,
-    "total_flos": 1.6764562374852608e+16,
-    "train_loss": 0.0,
-    "train_runtime": 0.0156,
     "train_samples": 103932,
-    "train_samples_per_second": 6662118.554,
-    "train_steps_per_second": 104099.609
 }

 {
     "epoch": 1.0,
+    "total_flos": 2.1031592360023163e+18,
+    "train_loss": 0.7368329773689138,
+    "train_runtime": 67624.003,
     "train_samples": 103932,
+    "train_samples_per_second": 1.537,
+    "train_steps_per_second": 0.006
 }

trainer_state.json CHANGED

The diff for this file is too large to render.