---
license: other
base_model: typeof/phi-1_5
tags:
- generated_from_trainer
model-index:
- name: phi-kelm-out
  results: []
---

[Built with Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

# phi-kelm-out

This model is a fine-tuned version of [typeof/phi-1_5](https://huggingface.co/typeof/phi-1_5) on the Kelm Tiny dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0079

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 5

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 7.8236        | 0.0   | 1     | 5.4714          |
| 4.156         | 0.1   | 995   | 4.0834          |
| 1.9418        | 0.2   | 1990  | 2.8447          |
| 1.8908        | 0.3   | 2985  | 2.2757          |
| 0.7631        | 0.4   | 3980  | 1.8792          |
| 1.0878        | 0.5   | 4975  | 1.4944          |
| 2.1561        | 0.6   | 5970  | 1.3413          |
| 0.452         | 0.7   | 6965  | 1.2682          |
| 2.1017        | 0.8   | 7960  | 1.2247          |
| 0.8352        | 0.9   | 8955  | 1.1999          |
| 5.1122        | 1.0   | 9950  | 1.1778          |
| 1.6136        | 1.1   | 10945 | 1.1515          |
| 2.3537        | 1.2   | 11940 | 1.1364          |
| 0.2987        | 1.3   | 12935 | 1.1391          |
| 0.747         | 1.4   | 13930 | 1.0977          |
| 0.0025        | 1.5   | 14925 | 1.0917          |
| 0.6355        | 1.6   | 15920 | 1.0630          |
| 0.5881        | 1.7   | 16915 | 1.0565          |
| 0.3181        | 1.8   | 17910 | 1.0568          |
| 0.9256        | 1.9   | 18905 | 1.0623          |
| 4.5318        | 2.0   | 19900 | 1.0678          |
| 0.8736        | 2.1   | 20895 | 1.0645          |
| 2.2079        | 2.2   | 21890 | 1.0474          |
| 2.7407        | 2.3   | 22885 | 1.0438          |
| 2.2308        | 2.4   | 23880 | 1.0485          |
| 0.4307        | 2.5   | 24875 | 1.0235          |
| 0.2956        | 2.6   | 25870 | 1.0201          |
| 0.203         | 2.7   | 26865 | 1.0200          |
| 2.2452        | 2.8   | 27860 | 1.0243          |
| 0.942         | 2.9   | 28855 | 1.0289          |
| 0.0069        | 3.0   | 29850 | 1.0181          |
| 3.2121        | 3.1   | 30845 | 1.0235          |
| 1.4533        | 3.2   | 31840 | 1.0127          |
| 0.208         | 3.3   | 32835 | 1.0110          |
| 0.1379        | 3.4   | 33830 | 1.0126          |
| 0.1991        | 3.5   | 34825 | 1.0103          |
| 1.3019        | 3.6   | 35820 | 1.0154          |
| 0.6602        | 3.7   | 36815 | 1.0178          |
| 0.5271        | 3.8   | 37810 | 1.0087          |
| 0.3131        | 3.9   | 38805 | 1.0092          |
| 3.6821        | 4.0   | 39800 | 1.0094          |
| 0.3724        | 4.1   | 40795 | 1.0093          |
| 0.0704        | 4.2   | 41790 | 1.0081          |
| 0.1209        | 4.3   | 42785 | 1.0108          |
| 0.9807        | 4.4   | 43780 | 1.0072          |
| 0.1392        | 4.5   | 44775 | 1.0078          |
| 0.2561        | 4.6   | 45770 | 1.0078          |
| 0.1533        | 4.7   | 46765 | 1.0089          |
| 0.4302        | 4.8   | 47760 | 1.0079          |
| 1.3744        | 4.9   | 48755 | 1.0074          |
| 0.8572        | 5.0   | 49750 | 1.0079          |

### Framework versions

- Transformers 4.35.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1
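
### Reproducing the optimizer and scheduler

The hyperparameters above can be reproduced outside of Axolotl with standard PyTorch and `transformers` utilities. This is a minimal sketch, not the exact training code: the dummy parameter stands in for the model weights, and the total step count (49750) is inferred from the final row of the results table (5 epochs × 9950 steps/epoch).

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# Dummy parameter in place of the actual phi-1_5 model weights.
param = torch.nn.Parameter(torch.zeros(1))

# Adam with the betas/epsilon listed under "Training hyperparameters".
optimizer = torch.optim.Adam([param], lr=3e-6, betas=(0.9, 0.95), eps=1e-5)

# Cosine schedule with 1000 warmup steps, over the full 49750-step run.
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=1000,
    num_training_steps=49750,  # assumed: 5 epochs x 9950 steps/epoch, per the table
)

# The learning rate ramps linearly from 0 to 3e-6 over the warmup steps,
# then decays along a cosine curve for the remainder of training.
for _ in range(1000):
    optimizer.step()
    scheduler.step()

peak_lr = scheduler.get_last_lr()[0]  # at the end of warmup, lr reaches its 3e-6 peak
```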