mikhail-panzo/zlm-fil_b64_le5_s8000

@@ -1,6 +1,5 @@
 ---
-license: mit
-base_model: mikhail-panzo/zlm_b64_le4_s12000
 tags:
 - generated_from_trainer
 model-index:
@@ -13,9 +12,9 @@ should probably proofread and complete it, then remove this comment. -->
 # zlm-fil_b64_le5_s8000
-This model is a fine-tuned version of [mikhail-panzo/zlm_b64_le4_s12000](https://huggingface.co/mikhail-panzo/zlm_b64_le4_s12000) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4118
 ## Model description
@@ -50,22 +49,22 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch    | Step | Validation Loss |
 |:-------------:|:--------:|:----:|:---------------:|
-| 0.5529        | 22.2222  | 500  | 0.5000          |
-| 0.4974        | 44.4444  | 1000 | 0.4557          |
-| 0.4716        | 66.6667  | 1500 | 0.4359          |
-| 0.453         | 88.8889  | 2000 | 0.4246          |
-| 0.4428        | 111.1111 | 2500 | 0.4196          |
-| 0.4332        | 133.3333 | 3000 | 0.4171          |
-| 0.4246        | 155.5556 | 3500 | 0.4154          |
-| 0.4202        | 177.7778 | 4000 | 0.4133          |
-| 0.4223        | 200.0    | 4500 | 0.4145          |
-| 0.4127        | 222.2222 | 5000 | 0.4118          |
-| 0.418         | 244.4444 | 5500 | 0.4130          |
-| 0.4137        | 266.6667 | 6000 | 0.4130          |
-| 0.4105        | 288.8889 | 6500 | 0.4127          |
-| 0.4164        | 311.1111 | 7000 | 0.4127          |
-| 0.4088        | 333.3333 | 7500 | 0.4120          |
-| 0.4028        | 355.5556 | 8000 | 0.4118          |
 ### Framework versions

 ---
+base_model: mikhail-panzo/zlm_b128_le4_s12000
 tags:
 - generated_from_trainer
 model-index:
 # zlm-fil_b64_le5_s8000
+This model is a fine-tuned version of [mikhail-panzo/zlm_b128_le4_s12000](https://huggingface.co/mikhail-panzo/zlm_b128_le4_s12000) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4077
 ## Model description
 | Training Loss | Epoch    | Step | Validation Loss |
 |:-------------:|:--------:|:----:|:---------------:|
+| 0.5541        | 21.7391  | 500  | 0.4977          |
+| 0.4931        | 43.4783  | 1000 | 0.4529          |
+| 0.4695        | 65.2174  | 1500 | 0.4330          |
+| 0.4518        | 86.9565  | 2000 | 0.4230          |
+| 0.4442        | 108.6957 | 2500 | 0.4179          |
+| 0.4344        | 130.4348 | 3000 | 0.4135          |
+| 0.4318        | 152.1739 | 3500 | 0.4111          |
+| 0.4201        | 173.9130 | 4000 | 0.4110          |
+| 0.4185        | 195.6522 | 4500 | 0.4091          |
+| 0.4153        | 217.3913 | 5000 | 0.4097          |
+| 0.414         | 239.1304 | 5500 | 0.4069          |
+| 0.4113        | 260.8696 | 6000 | 0.4080          |
+| 0.4133        | 282.6087 | 6500 | 0.4073          |
+| 0.4095        | 304.3478 | 7000 | 0.4059          |
+| 0.4129        | 326.0870 | 7500 | 0.4083          |
+| 0.4035        | 347.8261 | 8000 | 0.4077          |
 ### Framework versions
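
Reviewer note: the loss reported in the updated card (0.4077) is the step-8000 value, which is not the lowest validation loss in the new table. The following is a minimal sketch for checking this, with the rows copied by hand from the removed and added sides of the table above. The names `OLD_RUN`, `NEW_RUN`, and `best_checkpoint` are illustrative and do not come from the repository.

```python
# (step, validation loss) pairs copied from the removed (-) side of the table.
OLD_RUN = [
    (500, 0.5000), (1000, 0.4557), (1500, 0.4359), (2000, 0.4246),
    (2500, 0.4196), (3000, 0.4171), (3500, 0.4154), (4000, 0.4133),
    (4500, 0.4145), (5000, 0.4118), (5500, 0.4130), (6000, 0.4130),
    (6500, 0.4127), (7000, 0.4127), (7500, 0.4120), (8000, 0.4118),
]

# (step, validation loss) pairs copied from the added (+) side of the table.
NEW_RUN = [
    (500, 0.4977), (1000, 0.4529), (1500, 0.4330), (2000, 0.4230),
    (2500, 0.4179), (3000, 0.4135), (3500, 0.4111), (4000, 0.4110),
    (4500, 0.4091), (5000, 0.4097), (5500, 0.4069), (6000, 0.4080),
    (6500, 0.4073), (7000, 0.4059), (7500, 0.4083), (8000, 0.4077),
]


def best_checkpoint(rows):
    """Return the (step, loss) pair with the lowest loss; ties go to the earliest step."""
    return min(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, rows in (("old", OLD_RUN), ("new", NEW_RUN)):
        best_step, best_loss = best_checkpoint(rows)
        final_step, final_loss = rows[-1]
        print(f"{name}: best {best_loss:.4f} at step {best_step}, "
              f"final {final_loss:.4f} at step {final_step}")
```

Whether the uploaded weights correspond to the final step or to an earlier checkpoint cannot be determined from the diff alone.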

generation_config.json CHANGED

@@ -5,5 +5,6 @@
   "eos_token_id": 2,
   "max_length": 1876,
   "pad_token_id": 1,
-  "transformers_version": "4.41.0.dev0"
 }

   "eos_token_id": 2,
   "max_length": 1876,
   "pad_token_id": 1,
+  "transformers_version": "4.41.0.dev0",
+  "use_cache": false
 }

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e4f5b7eb731d80ba7fa9e08897112adb60df70410cb7a191fbe80df3ce92be45
 size 577789320

 version https://git-lfs.github.com/spec/v1
+oid sha256:1893d174fc5f55f120c5d3ea3b7c7f2c4d2a38a52bfce7cc13d3e84b6a692d7f
 size 577789320
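
Reviewer note: only the `oid` changes in the pointer above, while `size` stays at 577789320 bytes, which is what one would expect when weights are retrained without changing the architecture. A minimal sketch, assuming a local copy of the weights file is available, for checking a download against the new pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and only `hashlib` from the standard library is used.

```python
import hashlib

# Pointer text copied from the added (+) side of the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1893d174fc5f55f120c5d3ea3b7c7f2c4d2a38a52bfce7cc13d3e84b6a692d7f
size 577789320
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 digest named by `pointer`."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so a file of several hundred megabytes is not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer("model.safetensors", parse_lfs_pointer(NEW_POINTER))` would check a locally downloaded file, where the path is an assumption about where the file was saved.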