# daniel-train-test1
This model is a fine-tuned version of [allenai/OLMo-1B-hf](https://huggingface.co/allenai/OLMo-1B-hf) on the generator dataset. It achieves the following results on the evaluation set:
- Loss: 0.0894
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1.0
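
The listed total train batch size and the scheduler's warmup length both follow arithmetically from the settings above. A quick sanity-check sketch (the ~1034 total optimizer steps is an estimate read off the results table below, where step 1014 falls at epoch 0.9807; it is not stated directly in the card):

```python
# Values copied from the hyperparameter list above.
train_batch_size = 32
gradient_accumulation_steps = 2
warmup_ratio = 0.05

# Effective (total) train batch size = per-device batch * accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 64, matching the value listed above

# Total optimizer steps estimated from the results table
# (step 1014 reached at epoch 0.9807 of a 1-epoch run).
total_steps = round(1014 / 0.9807)
warmup_steps = int(warmup_ratio * total_steps)
print(total_steps, warmup_steps)  # ~1034 steps, ~51 warmup steps
```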
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
1.216 | 0.0251 | 26 | 1.1370 |
0.207 | 0.0503 | 52 | 0.2054 |
0.1815 | 0.0754 | 78 | 0.1577 |
0.1741 | 0.1006 | 104 | 0.1424 |
0.1282 | 0.1257 | 130 | 0.1351 |
0.1276 | 0.1509 | 156 | 0.1280 |
0.1256 | 0.1760 | 182 | 0.1228 |
0.1047 | 0.2012 | 208 | 0.1194 |
0.1006 | 0.2263 | 234 | 0.1169 |
0.0934 | 0.2515 | 260 | 0.1140 |
0.161 | 0.2766 | 286 | 0.1111 |
0.105 | 0.3017 | 312 | 0.1114 |
0.0928 | 0.3269 | 338 | 0.1106 |
0.1078 | 0.3520 | 364 | 0.1101 |
0.1191 | 0.3772 | 390 | 0.1076 |
0.0895 | 0.4023 | 416 | 0.1057 |
0.1124 | 0.4275 | 442 | 0.1050 |
0.0939 | 0.4526 | 468 | 0.1034 |
0.0961 | 0.4778 | 494 | 0.1024 |
0.1109 | 0.5029 | 520 | 0.1031 |
0.1207 | 0.5280 | 546 | 0.1026 |
0.0755 | 0.5532 | 572 | 0.0994 |
0.0949 | 0.5783 | 598 | 0.0980 |
0.0901 | 0.6035 | 624 | 0.0971 |
0.085 | 0.6286 | 650 | 0.0961 |
0.0794 | 0.6538 | 676 | 0.0951 |
0.0726 | 0.6789 | 702 | 0.0944 |
0.075 | 0.7041 | 728 | 0.0933 |
0.1057 | 0.7292 | 754 | 0.0927 |
0.0856 | 0.7544 | 780 | 0.0919 |
0.0688 | 0.7795 | 806 | 0.0911 |
0.1177 | 0.8046 | 832 | 0.0911 |
0.0888 | 0.8298 | 858 | 0.0908 |
0.0959 | 0.8549 | 884 | 0.0903 |
0.0827 | 0.8801 | 910 | 0.0899 |
0.0629 | 0.9052 | 936 | 0.0896 |
0.1093 | 0.9304 | 962 | 0.0895 |
0.0882 | 0.9555 | 988 | 0.0895 |
0.0859 | 0.9807 | 1014 | 0.0894 |
### Framework versions
- PEFT 0.10.0
- Transformers 4.40.1
- Pytorch 2.2.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1