# TinyLLaMA-1.1B-OrcaPlatty
This model is a fine-tuned version of [jeff31415/TinyLlama-1.1B-1T-OpenOrca](https://huggingface.co/jeff31415/TinyLlama-1.1B-1T-OpenOrca) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.5156
## Model description
More information needed
## Intended uses & limitations
More information needed
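Pending fuller documentation, the sketch below shows one plausible way to load and query the model. It assumes the checkpoint is hosted at `marcchew/TinyLLaMA-1.1B-OrcaPlatty` and that the tokenizer ships with the repository; the prompt is illustrative only.

```python
# Minimal sketch: load the fine-tuned checkpoint and generate a short reply.
# The repository id and prompt are assumptions, not documented usage.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "marcchew/TinyLLaMA-1.1B-OrcaPlatty"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain gradient accumulation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```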
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a matching `TrainingArguments` sketch follows the list):
- learning_rate: 9e-07
- train_batch_size: 20
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 80
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
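For reproducibility, the hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch inferred from the list, not the original training script; `output_dir` is hypothetical.

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the hyperparameter list above.
args = TrainingArguments(
    output_dir="tinyllama-orcaplatty",  # hypothetical output path
    learning_rate=9e-07,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,      # 20 x 4 = 80 total train batch size
    lr_scheduler_type="linear",
    num_train_epochs=3,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```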
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
2.1726 | 0.03 | 8 | 2.3170 |
2.1444 | 0.05 | 16 | 2.2937 |
2.1036 | 0.08 | 24 | 2.2707 |
2.0703 | 0.1 | 32 | 2.2478 |
2.0604 | 0.13 | 40 | 2.2248 |
2.046 | 0.15 | 48 | 2.2013 |
1.9919 | 0.18 | 56 | 2.1780 |
1.9842 | 0.21 | 64 | 2.1547 |
1.9234 | 0.23 | 72 | 2.1320 |
1.9235 | 0.26 | 80 | 2.1099 |
1.9096 | 0.28 | 88 | 2.0884 |
1.8722 | 0.31 | 96 | 2.0679 |
1.8594 | 0.34 | 104 | 2.0479 |
1.8438 | 0.36 | 112 | 2.0283 |
1.7581 | 0.39 | 120 | 2.0089 |
1.7852 | 0.41 | 128 | 1.9901 |
1.7634 | 0.44 | 136 | 1.9714 |
1.7296 | 0.46 | 144 | 1.9531 |
1.6976 | 0.49 | 152 | 1.9353 |
1.6861 | 0.52 | 160 | 1.9173 |
1.6683 | 0.54 | 168 | 1.8993 |
1.6255 | 0.57 | 176 | 1.8826 |
1.619 | 0.59 | 184 | 1.8673 |
1.6455 | 0.62 | 192 | 1.8534 |
1.5784 | 0.65 | 200 | 1.8399 |
1.6078 | 0.67 | 208 | 1.8259 |
1.5703 | 0.7 | 216 | 1.8124 |
1.5215 | 0.72 | 224 | 1.7989 |
1.542 | 0.75 | 232 | 1.7852 |
1.5147 | 0.77 | 240 | 1.7721 |
1.5092 | 0.8 | 248 | 1.7589 |
1.4564 | 0.83 | 256 | 1.7456 |
1.4985 | 0.85 | 264 | 1.7324 |
1.4505 | 0.88 | 272 | 1.7189 |
1.4447 | 0.9 | 280 | 1.7052 |
1.4436 | 0.93 | 288 | 1.6924 |
1.4132 | 0.95 | 296 | 1.6799 |
1.3791 | 0.98 | 304 | 1.6680 |
1.3877 | 1.01 | 312 | 1.6565 |
1.3807 | 1.03 | 320 | 1.6453 |
1.3391 | 1.06 | 328 | 1.6352 |
1.3232 | 1.08 | 336 | 1.6251 |
1.3293 | 1.11 | 344 | 1.6159 |
1.3029 | 1.14 | 352 | 1.6074 |
1.3173 | 1.16 | 360 | 1.5992 |
1.3006 | 1.19 | 368 | 1.5926 |
1.2547 | 1.21 | 376 | 1.5863 |
1.2704 | 1.24 | 384 | 1.5805 |
1.2964 | 1.26 | 392 | 1.5749 |
1.277 | 1.29 | 400 | 1.5695 |
1.2718 | 1.32 | 408 | 1.5657 |
1.2379 | 1.34 | 416 | 1.5619 |
1.2746 | 1.37 | 424 | 1.5585 |
1.2349 | 1.39 | 432 | 1.5559 |
1.2264 | 1.42 | 440 | 1.5531 |
1.2365 | 1.45 | 448 | 1.5505 |
1.2242 | 1.47 | 456 | 1.5484 |
1.2094 | 1.5 | 464 | 1.5462 |
1.2196 | 1.52 | 472 | 1.5444 |
1.2447 | 1.55 | 480 | 1.5426 |
1.2127 | 1.57 | 488 | 1.5407 |
1.2278 | 1.6 | 496 | 1.5391 |
1.2089 | 1.63 | 504 | 1.5377 |
1.2069 | 1.65 | 512 | 1.5361 |
1.2264 | 1.68 | 520 | 1.5350 |
1.2027 | 1.7 | 528 | 1.5338 |
1.2138 | 1.73 | 536 | 1.5325 |
1.207 | 1.75 | 544 | 1.5313 |
1.2155 | 1.78 | 552 | 1.5304 |
1.2192 | 1.81 | 560 | 1.5295 |
1.2223 | 1.83 | 568 | 1.5287 |
1.2281 | 1.86 | 576 | 1.5278 |
1.1977 | 1.88 | 584 | 1.5269 |
1.2101 | 1.91 | 592 | 1.5261 |
1.2099 | 1.94 | 600 | 1.5254 |
1.1873 | 1.96 | 608 | 1.5245 |
1.204 | 1.99 | 616 | 1.5242 |
1.21 | 2.01 | 624 | 1.5239 |
1.242 | 2.04 | 632 | 1.5231 |
1.1696 | 2.06 | 640 | 1.5224 |
1.1803 | 2.09 | 648 | 1.5218 |
1.1692 | 2.12 | 656 | 1.5213 |
1.212 | 2.14 | 664 | 1.5208 |
1.1977 | 2.17 | 672 | 1.5204 |
1.187 | 2.19 | 680 | 1.5201 |
1.1858 | 2.22 | 688 | 1.5199 |
1.1824 | 2.25 | 696 | 1.5194 |
1.1914 | 2.27 | 704 | 1.5190 |
1.1815 | 2.3 | 712 | 1.5187 |
1.2021 | 2.32 | 720 | 1.5184 |
1.1872 | 2.35 | 728 | 1.5181 |
1.1901 | 2.37 | 736 | 1.5178 |
1.1933 | 2.4 | 744 | 1.5177 |
1.1773 | 2.43 | 752 | 1.5175 |
1.1935 | 2.45 | 760 | 1.5172 |
1.2118 | 2.48 | 768 | 1.5170 |
1.1816 | 2.5 | 776 | 1.5169 |
1.1842 | 2.53 | 784 | 1.5167 |
1.1891 | 2.55 | 792 | 1.5165 |
1.1883 | 2.58 | 800 | 1.5164 |
1.1506 | 2.61 | 808 | 1.5163 |
1.1708 | 2.63 | 816 | 1.5162 |
1.1944 | 2.66 | 824 | 1.5160 |
1.1575 | 2.68 | 832 | 1.5159 |
1.1698 | 2.71 | 840 | 1.5160 |
1.1525 | 2.74 | 848 | 1.5158 |
1.1767 | 2.76 | 856 | 1.5157 |
1.1943 | 2.79 | 864 | 1.5158 |
1.1727 | 2.81 | 872 | 1.5157 |
1.195 | 2.84 | 880 | 1.5157 |
1.1771 | 2.86 | 888 | 1.5157 |
1.1731 | 2.89 | 896 | 1.5156 |
1.191 | 2.92 | 904 | 1.5157 |
1.1903 | 2.94 | 912 | 1.5156 |
1.1821 | 2.97 | 920 | 1.5156 |
1.2 | 2.99 | 928 | 1.5156 |
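The validation loss plateaus at 1.5156 from roughly step 896 onward, so additional epochs were yielding diminishing returns. Assuming the reported loss is mean per-token cross-entropy in nats (the usual causal-LM convention in `transformers`), the final loss corresponds to a perplexity of about 4.55:

```python
import math

final_eval_loss = 1.5156  # final validation loss from the table above

# Perplexity = exp(cross-entropy); assumes the loss is mean per-token
# cross-entropy in nats, which is the usual causal-LM convention.
perplexity = math.exp(final_eval_loss)
print(f"perplexity = {perplexity:.2f}")  # ~4.55
```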
### Framework versions
- Transformers 4.36.0.dev0
- Pytorch 2.1.0+cu118
- Datasets 2.15.1.dev0
- Tokenizers 0.15.0