---
license: other
base_model: typeof/phi-1_5
tags:
- generated_from_trainer
model-index:
- name: phi-kelm-out
results: []
---
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
# phi-kelm-out
This model is a fine-tuned version of [typeof/phi-1_5](https://huggingface.co/typeof/phi-1_5) on the Kelm Tiny dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0079
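For reference, if this is the standard per-token cross-entropy loss, it corresponds to a perplexity of exp(1.0079) ≈ 2.74 on the evaluation set.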
## Model description
`phi-kelm-out` is [typeof/phi-1_5](https://huggingface.co/typeof/phi-1_5) fine-tuned for 5 epochs on the Kelm Tiny dataset using Axolotl; further architectural details are not documented here.
## Intended uses & limitations
More information needed
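Pending a fuller description, here is a minimal inference sketch using `transformers`; the `model_id` is a hypothetical placeholder for wherever this checkpoint is hosted (or a local path):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical location of this checkpoint; replace with the actual Hub id or local path.
model_id = "phi-kelm-out"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Depending on how the phi-1_5 base was ported, trust_remote_code=True may be required.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the `accelerate` package
)

prompt = "The Eiffel Tower is located in"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```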
## Training and evaluation data
Fine-tuning and evaluation used the Kelm Tiny dataset; details of its size, splits, and preprocessing are not documented here.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 3e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 5
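Assuming the run was driven through the Hugging Face `Trainer` (which generated this card), the list above maps onto `TrainingArguments` roughly as follows; `output_dir` is illustrative, and the actual config may include additional settings:

```python
from transformers import TrainingArguments

# A sketch of the hyperparameters listed above; not the author's exact config.
training_args = TrainingArguments(
    output_dir="phi-kelm-out",
    learning_rate=3e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-5,
    lr_scheduler_type="cosine",
    warmup_steps=1000,
    num_train_epochs=5,
)
```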
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 7.8236 | 0.0 | 1 | 5.4714 |
| 4.156 | 0.1 | 995 | 4.0834 |
| 1.9418 | 0.2 | 1990 | 2.8447 |
| 1.8908 | 0.3 | 2985 | 2.2757 |
| 0.7631 | 0.4 | 3980 | 1.8792 |
| 1.0878 | 0.5 | 4975 | 1.4944 |
| 2.1561 | 0.6 | 5970 | 1.3413 |
| 0.452 | 0.7 | 6965 | 1.2682 |
| 2.1017 | 0.8 | 7960 | 1.2247 |
| 0.8352 | 0.9 | 8955 | 1.1999 |
| 5.1122 | 1.0 | 9950 | 1.1778 |
| 1.6136 | 1.1 | 10945 | 1.1515 |
| 2.3537 | 1.2 | 11940 | 1.1364 |
| 0.2987 | 1.3 | 12935 | 1.1391 |
| 0.747 | 1.4 | 13930 | 1.0977 |
| 0.0025 | 1.5 | 14925 | 1.0917 |
| 0.6355 | 1.6 | 15920 | 1.0630 |
| 0.5881 | 1.7 | 16915 | 1.0565 |
| 0.3181 | 1.8 | 17910 | 1.0568 |
| 0.9256 | 1.9 | 18905 | 1.0623 |
| 4.5318 | 2.0 | 19900 | 1.0678 |
| 0.8736 | 2.1 | 20895 | 1.0645 |
| 2.2079 | 2.2 | 21890 | 1.0474 |
| 2.7407 | 2.3 | 22885 | 1.0438 |
| 2.2308 | 2.4 | 23880 | 1.0485 |
| 0.4307 | 2.5 | 24875 | 1.0235 |
| 0.2956 | 2.6 | 25870 | 1.0201 |
| 0.203 | 2.7 | 26865 | 1.0200 |
| 2.2452 | 2.8 | 27860 | 1.0243 |
| 0.942 | 2.9 | 28855 | 1.0289 |
| 0.0069 | 3.0 | 29850 | 1.0181 |
| 3.2121 | 3.1 | 30845 | 1.0235 |
| 1.4533 | 3.2 | 31840 | 1.0127 |
| 0.208 | 3.3 | 32835 | 1.0110 |
| 0.1379 | 3.4 | 33830 | 1.0126 |
| 0.1991 | 3.5 | 34825 | 1.0103 |
| 1.3019 | 3.6 | 35820 | 1.0154 |
| 0.6602 | 3.7 | 36815 | 1.0178 |
| 0.5271 | 3.8 | 37810 | 1.0087 |
| 0.3131 | 3.9 | 38805 | 1.0092 |
| 3.6821 | 4.0 | 39800 | 1.0094 |
| 0.3724 | 4.1 | 40795 | 1.0093 |
| 0.0704 | 4.2 | 41790 | 1.0081 |
| 0.1209 | 4.3 | 42785 | 1.0108 |
| 0.9807 | 4.4 | 43780 | 1.0072 |
| 0.1392 | 4.5 | 44775 | 1.0078 |
| 0.2561 | 4.6 | 45770 | 1.0078 |
| 0.1533 | 4.7 | 46765 | 1.0089 |
| 0.4302 | 4.8 | 47760 | 1.0079 |
| 1.3744 | 4.9 | 48755 | 1.0074 |
| 0.8572 | 5.0 | 49750 | 1.0079 |
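Note that the training-loss column is very noisy, most likely because the batch size is 1, so each entry reflects only a handful of examples; the validation loss is the steadier signal, falling rapidly during the first epoch and plateauing around 1.01 from roughly epoch 3.3 onward (best value 1.0072 at step 43780).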
### Framework versions
- Transformers 4.35.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1
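A quick way to check a local environment against these pins (assumption: exact versions matter only for bit-for-bit reproduction; nearby versions usually suffice):

```python
import transformers, torch, datasets, tokenizers

print(transformers.__version__)  # expected: 4.35.1
print(torch.__version__)         # expected: 2.0.1+cu118
print(datasets.__version__)      # expected: 2.14.5
print(tokenizers.__version__)    # expected: 0.14.1
```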