End of training


Files changed (5)

README.md +87 -0
generation_config.json +7 -0
model.safetensors +1 -1
runs/Jul20_00-57-49_8e1f1d7bf2e7/events.out.tfevents.1721437083.8e1f1d7bf2e7.8132.0 +2 -2
runs/Jul20_00-57-49_8e1f1d7bf2e7/events.out.tfevents.1721442970.8e1f1d7bf2e7.8132.1 +3 -0

README.md ADDED

	@@ -0,0 +1,87 @@

+---
+license: mit
+base_model: microsoft/phi-1_5
+tags:
+- trl
+- sft
+- generated_from_trainer
+datasets:
+- generator
+model-index:
+- name: phi-1_5-sft-openhermes-v2
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# phi-1_5-sft-openhermes-v2
+This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.1750
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 8e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 500
+- num_epochs: 2
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.7865        | 0.0831 | 275  | 1.4033          |
+| 1.3614        | 0.1663 | 550  | 1.3218          |
+| 1.2986        | 0.2494 | 825  | 1.2788          |
+| 1.2667        | 0.3325 | 1100 | 1.2531          |
+| 1.2405        | 0.4157 | 1375 | 1.2376          |
+| 1.2239        | 0.4988 | 1650 | 1.2237          |
+| 1.2078        | 0.5819 | 1925 | 1.2122          |
+| 1.2114        | 0.6651 | 2200 | 1.2005          |
+| 1.2028        | 0.7482 | 2475 | 1.1915          |
+| 1.173         | 0.8313 | 2750 | 1.1833          |
+| 1.1782        | 0.9144 | 3025 | 1.1776          |
+| 1.1805        | 0.9976 | 3300 | 1.1720          |
+| 1.0112        | 1.0807 | 3575 | 1.1817          |
+| 0.9988        | 1.1638 | 3850 | 1.1791          |
+| 0.9919        | 1.2470 | 4125 | 1.1786          |
+| 0.9886        | 1.3301 | 4400 | 1.1768          |
+| 0.9904        | 1.4132 | 4675 | 1.1763          |
+| 1.001         | 1.4964 | 4950 | 1.1756          |
+| 0.9979        | 1.5795 | 5225 | 1.1751          |
+| 0.9858        | 1.6626 | 5500 | 1.1750          |
+| 0.9975        | 1.7458 | 5775 | 1.1750          |
+| 0.9924        | 1.8289 | 6050 | 1.1750          |
+| 0.9978        | 1.9120 | 6325 | 1.1750          |
+| 0.9892        | 1.9952 | 6600 | 1.1750          |
+### Framework versions
+- Transformers 4.42.4
+- Pytorch 2.3.1+cu121
+- Datasets 2.20.0
+- Tokenizers 0.19.1
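
The hyperparameters and the training results table in the model card above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the card. It assumes a single training device (the card does not say how many were used) and assumes the `cosine` scheduler means linear warmup followed by a half-cosine decay to zero. The names `EVAL_LOG` and `cosine_lr` are illustrative and do not come from the training script.

```python
import math

# Hyperparameters copied from the model card above.
LEARNING_RATE = 8e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500
NUM_EPOCHS = 2

# (step, epoch, validation loss) rows copied from the training results table.
EVAL_LOG = [
    (275, 0.0831, 1.4033), (550, 0.1663, 1.3218), (825, 0.2494, 1.2788),
    (1100, 0.3325, 1.2531), (1375, 0.4157, 1.2376), (1650, 0.4988, 1.2237),
    (1925, 0.5819, 1.2122), (2200, 0.6651, 1.2005), (2475, 0.7482, 1.1915),
    (2750, 0.8313, 1.1833), (3025, 0.9144, 1.1776), (3300, 0.9976, 1.1720),
    (3575, 1.0807, 1.1817), (3850, 1.1638, 1.1791), (4125, 1.2470, 1.1786),
    (4400, 1.3301, 1.1768), (4675, 1.4132, 1.1763), (4950, 1.4964, 1.1756),
    (5225, 1.5795, 1.1751), (5500, 1.6626, 1.1750), (5775, 1.7458, 1.1750),
    (6050, 1.8289, 1.1750), (6325, 1.9120, 1.1750), (6600, 1.9952, 1.1750),
]

# Assumption: one device, so effective batch = per-device batch x accumulation.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Estimate optimizer steps per epoch from the last logged row.
last_step, last_epoch, final_val = EVAL_LOG[-1]
steps_per_epoch = last_step / last_epoch
total_steps = round(NUM_EPOCHS * steps_per_epoch)

# Evaluation point with the lowest validation loss.
best_step, best_epoch, best_val = min(EVAL_LOG, key=lambda row: row[2])


def cosine_lr(step, total, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Linear warmup, then half-cosine decay to zero (assumed schedule shape)."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", effective_batch)
print("estimated steps per epoch:", round(steps_per_epoch))
print("lowest validation loss:", best_val, "at step", best_step)
print("final validation loss:", final_val)
print("learning rate at step 3300:", cosine_lr(3300, total_steps))
```

Under these assumptions the effective batch size agrees with the reported `total_train_batch_size` of 16. The lowest validation loss in the table (1.1720) falls at step 3300, near the end of the first epoch. During the second epoch the training loss drops to roughly 0.99 while the validation loss settles at 1.1750, which is consistent with mild overfitting, although the card itself draws no such conclusion.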

generation_config.json ADDED

	@@ -0,0 +1,7 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50295,
+  "eos_token_id": 50296,
+  "pad_token_id": 50296,
+  "transformers_version": "4.42.4"
+}
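
One detail in the generation config above is easy to miss: `pad_token_id` and `eos_token_id` share the same ID. The sketch below parses the JSON exactly as committed and surfaces that fact using only the standard library. The variable names are illustrative.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG = """
{
  "_from_model_config": true,
  "bos_token_id": 50295,
  "eos_token_id": 50296,
  "pad_token_id": 50296,
  "transformers_version": "4.42.4"
}
"""

config = json.loads(GENERATION_CONFIG)

# Collect the special-token IDs declared in the config.
special_ids = {key: value for key, value in config.items() if key.endswith("_token_id")}

# True when padding reuses the end-of-sequence token.
pad_is_eos = config["pad_token_id"] == config["eos_token_id"]

print(special_ids)
print("pad token reuses EOS:", pad_is_eos)
```

When padding reuses the EOS token, padded positions cannot be told apart from a real end of sequence by token ID alone, so batched inference generally needs an explicit attention mask.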

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5194afaac03c28d0196c035a40230450264a09f48986bce84a6b84b303f7885d
+oid sha256:fbe4af698fcfe0d16cf8233b02141c736e02d139ed58f04f24fdcce272471a1c
 size 2829179858
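
The `model.safetensors` change above touches only a Git LFS pointer: a three-line text file recording the spec version, the SHA-256 of the real file, and its size in bytes. The size is unchanged here while the hash differs, which is what one would expect when weights are updated without changing the architecture. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of any Git LFS tooling.

```python
import hashlib

# New pointer contents for model.safetensors, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:fbe4af698fcfe0d16cf8233b02141c736e02d139ed58f04f24fdcce272471a1c
size 2829179858
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Stream a local file and compare its size and digest with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

With the weights downloaded locally, `matches_pointer("model.safetensors", pointer)` would report whether the file corresponds to this commit. The path is a placeholder for wherever the file was saved.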

runs/Jul20_00-57-49_8e1f1d7bf2e7/events.out.tfevents.1721437083.8e1f1d7bf2e7.8132.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1e0ddbfcf0a515fbc3fb82190408f973b64740d1c7243c6334aaf6d6657a979f
-size 17041
+oid sha256:07c9b7840d0686385de5c44036ef86afcfaeff9323f4f940f49d437b775830ad
+size 17395

runs/Jul20_00-57-49_8e1f1d7bf2e7/events.out.tfevents.1721442970.8e1f1d7bf2e7.8132.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2c7fb8612dd49ac88b44d2794c2a4ce6c83a6024491aa1fe2c9bc1bd7792f170
+size 359