
daniel-train-test1

This model is a PEFT adapter fine-tuned from allenai/OLMo-1B-hf on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0894

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1.0
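The reported total_train_batch_size of 64 follows from the per-device batch size and gradient accumulation. A minimal sketch of the configuration, collected in a plain dict whose keys mirror `transformers.TrainingArguments` field names (the dict itself is illustrative, not the actual training script):

```python
# Hyperparameters from the run above; key names mirror
# transformers.TrainingArguments fields (a sketch, not the original script).
hparams = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "cosine_with_restarts",
    "warmup_ratio": 0.05,
    "num_train_epochs": 1.0,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
}

# The effective (total) train batch size is the per-device batch size times
# the gradient accumulation steps, assuming a single-device run.
total_train_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 64
```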

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.216         | 0.0251 | 26   | 1.1370          |
| 0.207         | 0.0503 | 52   | 0.2054          |
| 0.1815        | 0.0754 | 78   | 0.1577          |
| 0.1741        | 0.1006 | 104  | 0.1424          |
| 0.1282        | 0.1257 | 130  | 0.1351          |
| 0.1276        | 0.1509 | 156  | 0.1280          |
| 0.1256        | 0.1760 | 182  | 0.1228          |
| 0.1047        | 0.2012 | 208  | 0.1194          |
| 0.1006        | 0.2263 | 234  | 0.1169          |
| 0.0934        | 0.2515 | 260  | 0.1140          |
| 0.161         | 0.2766 | 286  | 0.1111          |
| 0.105         | 0.3017 | 312  | 0.1114          |
| 0.0928        | 0.3269 | 338  | 0.1106          |
| 0.1078        | 0.3520 | 364  | 0.1101          |
| 0.1191        | 0.3772 | 390  | 0.1076          |
| 0.0895        | 0.4023 | 416  | 0.1057          |
| 0.1124        | 0.4275 | 442  | 0.1050          |
| 0.0939        | 0.4526 | 468  | 0.1034          |
| 0.0961        | 0.4778 | 494  | 0.1024          |
| 0.1109        | 0.5029 | 520  | 0.1031          |
| 0.1207        | 0.5280 | 546  | 0.1026          |
| 0.0755        | 0.5532 | 572  | 0.0994          |
| 0.0949        | 0.5783 | 598  | 0.0980          |
| 0.0901        | 0.6035 | 624  | 0.0971          |
| 0.085         | 0.6286 | 650  | 0.0961          |
| 0.0794        | 0.6538 | 676  | 0.0951          |
| 0.0726        | 0.6789 | 702  | 0.0944          |
| 0.075         | 0.7041 | 728  | 0.0933          |
| 0.1057        | 0.7292 | 754  | 0.0927          |
| 0.0856        | 0.7544 | 780  | 0.0919          |
| 0.0688        | 0.7795 | 806  | 0.0911          |
| 0.1177        | 0.8046 | 832  | 0.0911          |
| 0.0888        | 0.8298 | 858  | 0.0908          |
| 0.0959        | 0.8549 | 884  | 0.0903          |
| 0.0827        | 0.8801 | 910  | 0.0899          |
| 0.0629        | 0.9052 | 936  | 0.0896          |
| 0.1093        | 0.9304 | 962  | 0.0895          |
| 0.0882        | 0.9555 | 988  | 0.0895          |
| 0.0859        | 0.9807 | 1014 | 0.0894          |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.40.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
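Since the card lists PEFT among its framework versions and the checkpoint is an adapter on allenai/OLMo-1B-hf, it should be loadable via the standard `peft` API. A hedged sketch; the adapter's Hub repo id is not stated in the card, so it is left as a parameter rather than guessed:

```python
def load_adapter(adapter_id: str, base_id: str = "allenai/OLMo-1B-hf"):
    """Load the OLMo-1B base model and attach this PEFT adapter.

    `adapter_id` is the Hub repo id (or local path) holding the adapter
    weights; the card does not state it, so it must be supplied by the caller.
    """
    # Imports are kept inside the function so the sketch can be defined
    # without transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Calling `load_adapter("<your-adapter-repo>")` returns the adapted model together with the base tokenizer, ready for generation.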