metadata
license: other
base_model: typeof/phi-1_5
tags:
- generated_from_trainer
model-index:
- name: phi-kelm-out
  results: []
phi-kelm-out
This model is a fine-tuned version of typeof/phi-1_5 on the Kelm Tiny dataset. It achieves the following results on the evaluation set at the end of training (epoch 5.0, step 49750):
- Loss: 1.0079
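For intuition, the evaluation loss can be turned into a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The function name `perplexity` is illustrative.

```python
import math

# Final evaluation loss reported above.
EVAL_LOSS = 1.0079


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(loss_nats)


print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")  # ~ 2.74
```

If the loss were instead measured in bits or averaged differently, this conversion would not apply as written.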
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 5
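To make the schedule concrete, here is a minimal sketch of a linear warmup followed by a half-cosine decay, using the learning rate and warmup steps listed above. The total step count (49750) is taken from the last row of the training results table. The exact scheduler implementation used for this run is not stated, so the formula and the name `lr_at` are assumptions for illustration.

```python
import math

BASE_LR = 3e-06        # learning_rate from the list above
WARMUP_STEPS = 1000    # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 49750    # assumption: final step in the training results table


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 500, 1000, 25375, 49750):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate peaks at 3e-06 at step 1000, is half that at step 25375 (the midpoint of the decay phase), and reaches zero at the final step.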
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
7.8236 | 0.0 | 1 | 5.4714 |
4.156 | 0.1 | 995 | 4.0834 |
1.9418 | 0.2 | 1990 | 2.8447 |
1.8908 | 0.3 | 2985 | 2.2757 |
0.7631 | 0.4 | 3980 | 1.8792 |
1.0878 | 0.5 | 4975 | 1.4944 |
2.1561 | 0.6 | 5970 | 1.3413 |
0.452 | 0.7 | 6965 | 1.2682 |
2.1017 | 0.8 | 7960 | 1.2247 |
0.8352 | 0.9 | 8955 | 1.1999 |
5.1122 | 1.0 | 9950 | 1.1778 |
1.6136 | 1.1 | 10945 | 1.1515 |
2.3537 | 1.2 | 11940 | 1.1364 |
0.2987 | 1.3 | 12935 | 1.1391 |
0.747 | 1.4 | 13930 | 1.0977 |
0.0025 | 1.5 | 14925 | 1.0917 |
0.6355 | 1.6 | 15920 | 1.0630 |
0.5881 | 1.7 | 16915 | 1.0565 |
0.3181 | 1.8 | 17910 | 1.0568 |
0.9256 | 1.9 | 18905 | 1.0623 |
4.5318 | 2.0 | 19900 | 1.0678 |
0.8736 | 2.1 | 20895 | 1.0645 |
2.2079 | 2.2 | 21890 | 1.0474 |
2.7407 | 2.3 | 22885 | 1.0438 |
2.2308 | 2.4 | 23880 | 1.0485 |
0.4307 | 2.5 | 24875 | 1.0235 |
0.2956 | 2.6 | 25870 | 1.0201 |
0.203 | 2.7 | 26865 | 1.0200 |
2.2452 | 2.8 | 27860 | 1.0243 |
0.942 | 2.9 | 28855 | 1.0289 |
0.0069 | 3.0 | 29850 | 1.0181 |
3.2121 | 3.1 | 30845 | 1.0235 |
1.4533 | 3.2 | 31840 | 1.0127 |
0.208 | 3.3 | 32835 | 1.0110 |
0.1379 | 3.4 | 33830 | 1.0126 |
0.1991 | 3.5 | 34825 | 1.0103 |
1.3019 | 3.6 | 35820 | 1.0154 |
0.6602 | 3.7 | 36815 | 1.0178 |
0.5271 | 3.8 | 37810 | 1.0087 |
0.3131 | 3.9 | 38805 | 1.0092 |
3.6821 | 4.0 | 39800 | 1.0094 |
0.3724 | 4.1 | 40795 | 1.0093 |
0.0704 | 4.2 | 41790 | 1.0081 |
0.1209 | 4.3 | 42785 | 1.0108 |
0.9807 | 4.4 | 43780 | 1.0072 |
0.1392 | 4.5 | 44775 | 1.0078 |
0.2561 | 4.6 | 45770 | 1.0078 |
0.1533 | 4.7 | 46765 | 1.0089 |
0.4302 | 4.8 | 47760 | 1.0079 |
1.3744 | 4.9 | 48755 | 1.0074 |
0.8572 | 5.0 | 49750 | 1.0079 |
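The table is easier to read with a short summary. The sketch below copies the validation losses from the table in order and rebuilds the step column from the evaluation interval visible in it (step 1, then every 995 steps). It then reports the best and final evaluations. Variable names are illustrative.

```python
# Validation losses copied from the table above, in row order.
VAL_LOSSES = [
    5.4714, 4.0834, 2.8447, 2.2757, 1.8792, 1.4944,
    1.3413, 1.2682, 1.2247, 1.1999, 1.1778, 1.1515,
    1.1364, 1.1391, 1.0977, 1.0917, 1.0630, 1.0565,
    1.0568, 1.0623, 1.0678, 1.0645, 1.0474, 1.0438,
    1.0485, 1.0235, 1.0201, 1.0200, 1.0243, 1.0289,
    1.0181, 1.0235, 1.0127, 1.0110, 1.0126, 1.0103,
    1.0154, 1.0178, 1.0087, 1.0092, 1.0094, 1.0093,
    1.0081, 1.0108, 1.0072, 1.0078, 1.0078, 1.0089,
    1.0079, 1.0074, 1.0079,
]

# Evaluations ran at step 1 and then every 995 steps (995, 1990, ..., 49750).
STEPS = [1] + [995 * k for k in range(1, len(VAL_LOSSES))]

# min() on (loss, step) pairs picks the lowest loss, earliest step on ties.
best_loss, best_step = min(zip(VAL_LOSSES, STEPS))
final_loss, final_step = VAL_LOSSES[-1], STEPS[-1]

print(f"best:  {best_loss:.4f} at step {best_step}")
print(f"final: {final_loss:.4f} at step {final_step}")
print(f"gap:   {final_loss - best_loss:.4f}")
```

On these numbers the lowest validation loss is 1.0072 at step 43780 (epoch 4.4), and the final checkpoint at step 49750 is within 0.0007 of it. The per-step training loss is much noisier than the validation loss, which is expected with a train batch size of 1.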
Framework versions
- Transformers 4.35.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.14.1
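When trying to reproduce the run, it can help to compare the local environment against the versions listed above. This is a minimal sketch using only the standard library. The PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) and the helper name `installed_versions` are assumptions, not taken from this card.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by (assumed) PyPI distribution name.
EXPECTED = {
    "transformers": "4.35.1",
    "torch": "2.0.1+cu118",
    "datasets": "2.14.5",
    "tokenizers": "0.14.1",
}


def installed_versions(expected):
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        report[name] = (wanted, found)
    return report


for name, (wanted, found) in installed_versions(EXPECTED).items():
    status = "ok" if found == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {found} ({status})")
```

A mismatch does not necessarily prevent loading the model. It only flags where the local setup differs from the one recorded here.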