RikkiXu committed
Commit 4e6f9e5
1 Parent(s): cd6ab06

Model save
README.md CHANGED
@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [imone/Mistral_7B_with_EOT_token](https://huggingface.co/imone/Mistral_7B_with_EOT_token) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9067
+- Loss: 1.5409
 
 ## Model description
 
@@ -38,14 +38,14 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 4
-- eval_batch_size: 8
+- learning_rate: 5e-06
+- train_batch_size: 8
+- eval_batch_size: 4
 - seed: 42
 - distributed_type: multi-GPU
 - num_devices: 8
-- total_train_batch_size: 32
-- total_eval_batch_size: 64
+- total_train_batch_size: 64
+- total_eval_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
@@ -53,13 +53,13 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch | Step  | Validation Loss |
-|:-------------:|:-----:|:-----:|:---------------:|
-| 0.0936        | 1.0   | 2315  | 0.9145          |
-| 0.7793        | 2.0   | 4630  | 5.4135          |
-| 0.3835        | 3.0   | 6945  | 3.1220          |
-| 0.0959        | 4.0   | 9260  | 1.1934          |
-| 0.0725        | 5.0   | 11575 | 0.9067          |
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 0.6586        | 1.0   | 1126 | 1.2377          |
+| 0.4945        | 2.0   | 2252 | 1.2554          |
+| 0.3216        | 3.0   | 3378 | 1.3214          |
+| 0.1805        | 4.0   | 4504 | 1.4283          |
+| 0.1056        | 5.0   | 5630 | 1.5409          |
 
 
 ### Framework versions
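The updated hyperparameters are internally consistent: with no gradient accumulation (the card does not list any), the totals are simply per-device batch size times device count, and the step counts in the results table follow from the same arithmetic. A minimal sketch in plain Python, using the values from the updated card:

```python
# Effective batch sizes implied by the card's updated hyperparameters
# (assumes no gradient accumulation, which the card does not list).
train_batch_size = 8   # per-device
eval_batch_size = 4    # per-device
num_devices = 8

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

# The results table logs 5630 steps over 5 epochs, i.e. 1126 optimizer
# steps per epoch, suggesting roughly 1126 * 64 ~ 72k samples per epoch.
steps_per_epoch = 5630 // 5

print(total_train_batch_size, total_eval_batch_size, steps_per_epoch)  # → 64 32 1126
```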
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:40a7faad9a9857f9eaf97bb695b2ce6a6795902aee573bf142e57c3c236a4f6a
+oid sha256:18aaa8bfa190c2f8f532bc1015f35328c98d900f2782749e49e640fc9e5d11d0
 size 4943178720
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8b7aa038fc1b07012b73042a53ca58f1c1109a00509a0e7534f51a63047d481d
+oid sha256:adbbbd8d485b5144d29b892e6362eb90d468c8df0de888de0956cec0bc7354bb
 size 4999819336
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c9bb26ed2992db60cf261faacd1530a1cb1c0d33e6a60764cf6372ec0fe57ec6
+oid sha256:8ae25a70b4d6f2c53d6425b5cd3eaf8d6a4b8d4f4747fef6cb873f5c99f1ef20
 size 4540532728
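The safetensors entries above are Git LFS pointer files, not the weights themselves: each is a short key-value text file, and only the `oid` changes here because the shard sizes are identical. A minimal parser for the `version`/`oid`/`size` format, written as a hypothetical helper (not part of git-lfs itself):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file: one 'key value' pair per line."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer contents copied from the first shard's new version above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:18aaa8bfa190c2f8f532bc1015f35328c98d900f2782749e49e640fc9e5d11d0
size 4943178720
"""

info = parse_lfs_pointer(pointer)
print(info["oid"].partition(":")[0], int(info["size"]))  # → sha256 4943178720
```

The `oid` is the SHA-256 of the actual object, so comparing the old and new `oid` lines is enough to tell that every shard's weights changed in this commit.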
runs/May22_12-17-57_n136-129-074/events.out.tfevents.1716352017.n136-129-074.2581668.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6f8c66e666703191e0da54c2304821cb98cb0ca04050496a4278aca2e1729595
-size 238061
+oid sha256:5377b87767c04159caf85c2c4ff1e30c5356d24b5304e53ae683146d9b5f00a8
+size 244172
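One pattern worth flagging in the new training results: training loss falls every epoch while validation loss rises monotonically after epoch 1, a typical overfitting signature, so the reported final loss of 1.5409 is the worst of the five checkpoints by this metric. A small check over the table's values:

```python
# Per-epoch losses copied from the updated training-results table.
train_losses = [0.6586, 0.4945, 0.3216, 0.1805, 0.1056]
val_losses = [1.2377, 1.2554, 1.3214, 1.4283, 1.5409]

train_improves = all(a > b for a, b in zip(train_losses, train_losses[1:]))
val_worsens = all(a < b for a, b in zip(val_losses, val_losses[1:]))
best_epoch = val_losses.index(min(val_losses)) + 1

print(train_improves, val_worsens, best_epoch)  # → True True 1
```

By validation loss alone, the epoch-1 checkpoint would be the one to keep; whether that matters depends on the downstream metric the authors care about.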