# LoLlama3.2-1B-lora-50ep
This model is a LoRA fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3046
## Model description
More information needed
## Intended uses & limitations
More information needed
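
Detailed usage notes are still pending, but since the repo name and the framework versions below indicate a PEFT LoRA adapter for meta-llama/Llama-3.2-1B, here is a minimal loading sketch. The repo id `avinot/LoLlama3.2-1B-lora-50ep` and base model id come from this card; the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: load the LoRA adapter on top of its base model with PEFT.
# Assumes this repo hosts adapter weights trained against meta-llama/Llama-3.2-1B.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "avinot/LoLlama3.2-1B-lora-50ep")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

prompt = "Once upon a time"  # illustrative prompt; the training domain is unknown
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```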
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
- mixed_precision_training: Native AMP
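
The hyperparameters above map directly onto `transformers.TrainingArguments`. The sketch below is a hedged reconstruction: the `output_dir` is illustrative, and the surrounding model, dataset, and LoRA configuration are not specified by this card.

```python
# Hedged sketch of TrainingArguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="LoLlama3.2-1B-lora-50ep",  # illustrative output path
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,   # effective train batch size: 2 * 4 = 8
    num_train_epochs=50,
    lr_scheduler_type="linear",
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                       # "Native AMP" mixed precision (assumed fp16)
)
```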
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:---:|:---:|:---:|:---:|
| 3.1901 | 1.0 | 847 | 2.9623 |
| 2.8573 | 2.0 | 1694 | 2.8494 |
| 2.7574 | 3.0 | 2541 | 2.7774 |
| 2.7042 | 4.0 | 3388 | 2.7245 |
| 2.6395 | 5.0 | 4235 | 2.6837 |
| 2.5853 | 6.0 | 5082 | 2.6469 |
| 2.5659 | 7.0 | 5929 | 2.6132 |
| 2.5259 | 8.0 | 6776 | 2.5871 |
| 2.4823 | 9.0 | 7623 | 2.5630 |
| 2.4641 | 10.0 | 8470 | 2.5402 |
| 2.4462 | 11.0 | 9317 | 2.5189 |
| 2.4094 | 12.0 | 10164 | 2.5011 |
| 2.3841 | 13.0 | 11011 | 2.4873 |
| 2.3452 | 14.0 | 11858 | 2.4717 |
| 2.3421 | 15.0 | 12705 | 2.4554 |
| 2.3139 | 16.0 | 13552 | 2.4428 |
| 2.2926 | 17.0 | 14399 | 2.4322 |
| 2.2811 | 18.0 | 15246 | 2.4227 |
| 2.2511 | 19.0 | 16093 | 2.4116 |
| 2.2482 | 20.0 | 16940 | 2.4009 |
| 2.2302 | 21.0 | 17787 | 2.3951 |
| 2.2247 | 22.0 | 18634 | 2.3844 |
| 2.2074 | 23.0 | 19481 | 2.3793 |
| 2.1883 | 24.0 | 20328 | 2.3714 |
| 2.1753 | 25.0 | 21175 | 2.3673 |
| 2.1767 | 26.0 | 22022 | 2.3583 |
| 2.1454 | 27.0 | 22869 | 2.3511 |
| 2.1383 | 28.0 | 23716 | 2.3510 |
| 2.1409 | 29.0 | 24563 | 2.3435 |
| 2.1369 | 30.0 | 25410 | 2.3419 |
| 2.1175 | 31.0 | 26257 | 2.3340 |
| 2.1096 | 32.0 | 27104 | 2.3332 |
| 2.0996 | 33.0 | 27951 | 2.3299 |
| 2.0994 | 34.0 | 28798 | 2.3244 |
| 2.0936 | 35.0 | 29645 | 2.3205 |
| 2.0688 | 36.0 | 30492 | 2.3200 |
| 2.0898 | 37.0 | 31339 | 2.3184 |
| 2.0695 | 38.0 | 32186 | 2.3145 |
| 2.0765 | 39.0 | 33033 | 2.3139 |
| 2.0651 | 40.0 | 33880 | 2.3155 |
| 2.0497 | 41.0 | 34727 | 2.3105 |
| 2.0614 | 42.0 | 35574 | 2.3077 |
| 2.0519 | 43.0 | 36421 | 2.3067 |
| 2.0493 | 44.0 | 37268 | 2.3071 |
| 2.044 | 45.0 | 38115 | 2.3046 |
| 2.0491 | 46.0 | 38962 | 2.3054 |
| 2.052 | 47.0 | 39809 | 2.3037 |
| 2.0526 | 48.0 | 40656 | 2.3049 |
| 2.0319 | 49.0 | 41503 | 2.3044 |
| 2.0274 | 50.0 | 42350 | 2.3046 |
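
Assuming the reported loss is the mean per-token cross-entropy (the `Trainer` default for causal language modeling), the final validation loss corresponds to a perplexity of roughly exp(2.3046) ≈ 10.0:

```python
# Quick check: convert the final validation cross-entropy loss to perplexity,
# assuming the loss is mean per-token negative log-likelihood.
import math

final_val_loss = 2.3046
print(f"Validation perplexity: {math.exp(final_val_loss):.2f}")  # ~10.02
```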
### Framework versions
- PEFT 0.14.0
- Transformers 4.49.0
- PyTorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0