phi-3-mini-LoRA
This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.5513
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
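The relationship between these values can be checked with a short sketch. It assumes a single training device (which is what makes 2 × 4 match the reported total batch size of 8) and roughly 6,630 total optimizer steps, an estimate taken from the Step and Epoch columns of the training results table rather than a figure stated on this card. The `linear_lr` helper is hypothetical and only mirrors the usual linear warmup and decay shape; it is not the training code used for this model.

```python
import math

# Values reported in the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 2
gradient_accumulation_steps = 4
warmup_ratio = 0.1

# Assumption: one training device, so that 2 * 4 * 1 equals the reported total of 8.
num_devices = 1
# Assumption: total optimizer steps, estimated from the results table
# (about 2,210 steps per epoch over 3 epochs); not stated on the card.
total_steps = 6630

effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print(f"effective batch size: {effective_batch_size}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>5}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 0.0001 once warmup ends, about a tenth of the way into training, and then falls linearly to zero by the final step.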
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.0558 | 0.0452 | 100 | 0.9096 |
0.702 | 0.0905 | 200 | 0.6207 |
0.5897 | 0.1357 | 300 | 0.5882 |
0.5837 | 0.1810 | 400 | 0.5784 |
0.5862 | 0.2262 | 500 | 0.5761 |
0.5758 | 0.2715 | 600 | 0.5718 |
0.5738 | 0.3167 | 700 | 0.5703 |
0.5874 | 0.3620 | 800 | 0.5684 |
0.5621 | 0.4072 | 900 | 0.5667 |
0.5769 | 0.4524 | 1000 | 0.5648 |
0.5739 | 0.4977 | 1100 | 0.5645 |
0.5526 | 0.5429 | 1200 | 0.5631 |
0.5765 | 0.5882 | 1300 | 0.5622 |
0.5609 | 0.6334 | 1400 | 0.5613 |
0.5714 | 0.6787 | 1500 | 0.5609 |
0.5646 | 0.7239 | 1600 | 0.5614 |
0.5652 | 0.7691 | 1700 | 0.5602 |
0.5488 | 0.8144 | 1800 | 0.5604 |
0.5629 | 0.8596 | 1900 | 0.5597 |
0.5546 | 0.9049 | 2000 | 0.5585 |
0.577 | 0.9501 | 2100 | 0.5582 |
0.5597 | 0.9954 | 2200 | 0.5579 |
0.5538 | 1.0406 | 2300 | 0.5575 |
0.5483 | 1.0859 | 2400 | 0.5575 |
0.5585 | 1.1311 | 2500 | 0.5572 |
0.5485 | 1.1763 | 2600 | 0.5573 |
0.5502 | 1.2216 | 2700 | 0.5573 |
0.5401 | 1.2668 | 2800 | 0.5563 |
0.5484 | 1.3121 | 2900 | 0.5559 |
0.5429 | 1.3573 | 3000 | 0.5558 |
0.5617 | 1.4026 | 3100 | 0.5551 |
0.5483 | 1.4478 | 3200 | 0.5550 |
0.5678 | 1.4930 | 3300 | 0.5548 |
0.5533 | 1.5383 | 3400 | 0.5544 |
0.5586 | 1.5835 | 3500 | 0.5549 |
0.5537 | 1.6288 | 3600 | 0.5541 |
0.5512 | 1.6740 | 3700 | 0.5542 |
0.554 | 1.7193 | 3800 | 0.5539 |
0.5626 | 1.7645 | 3900 | 0.5536 |
0.5329 | 1.8098 | 4000 | 0.5533 |
0.5389 | 1.8550 | 4100 | 0.5532 |
0.5526 | 1.9002 | 4200 | 0.5532 |
0.5542 | 1.9455 | 4300 | 0.5527 |
0.5528 | 1.9907 | 4400 | 0.5532 |
0.5335 | 2.0360 | 4500 | 0.5531 |
0.543 | 2.0812 | 4600 | 0.5538 |
0.5452 | 2.1265 | 4700 | 0.5531 |
0.5352 | 2.1717 | 4800 | 0.5527 |
0.5507 | 2.2169 | 4900 | 0.5526 |
0.5553 | 2.2622 | 5000 | 0.5524 |
0.5395 | 2.3074 | 5100 | 0.5525 |
0.54 | 2.3527 | 5200 | 0.5523 |
0.5329 | 2.3979 | 5300 | 0.5521 |
0.5628 | 2.4432 | 5400 | 0.5521 |
0.542 | 2.4884 | 5500 | 0.5522 |
0.5244 | 2.5337 | 5600 | 0.5520 |
0.5348 | 2.5789 | 5700 | 0.5519 |
0.5465 | 2.6241 | 5800 | 0.5519 |
0.5392 | 2.6694 | 5900 | 0.5518 |
0.5412 | 2.7146 | 6000 | 0.5518 |
0.5497 | 2.7599 | 6100 | 0.5517 |
0.5354 | 2.8051 | 6200 | 0.5514 |
0.534 | 2.8504 | 6300 | 0.5515 |
0.5478 | 2.8956 | 6400 | 0.5514 |
0.5302 | 2.9408 | 6500 | 0.5513 |
0.5366 | 2.9861 | 6600 | 0.5513 |
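
A few rows of the table can be summarised with the sketch below. It assumes the reported loss is a mean per-token cross-entropy in nats, so that its exponential can be read as a perplexity; the card does not state this, and the variable names are illustrative only.

```python
import math

# (step, epoch, validation_loss) rows copied from the results table above.
checkpoints = [
    (100, 0.0452, 0.9096),
    (2200, 0.9954, 0.5579),
    (4400, 1.9907, 0.5532),
    (6600, 2.9861, 0.5513),
]

first_step, _, first_loss = checkpoints[0]
last_step, last_epoch, last_loss = checkpoints[-1]

# Optimizer steps per epoch, estimated from the last logged row.
steps_per_epoch = last_step / last_epoch

# Assumption: loss is mean per-token cross-entropy in nats, so exp(loss) is perplexity.
final_perplexity = math.exp(last_loss)

# Relative reduction in validation loss between the first and last logged evaluations.
relative_drop = (first_loss - last_loss) / first_loss

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"final perplexity ~ {final_perplexity:.2f}")
print(f"validation loss reduced by {relative_drop:.1%}")
```

Most of the reduction happens within the first few hundred steps. From the end of the first epoch (0.5579) to the final evaluation (0.5513), the validation loss changes by less than 0.01.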
Framework versions
- PEFT 0.12.0
- Transformers 4.43.1
- Pytorch 2.4.0a0+3bcc3cddb5.nv24.07
- Datasets 2.20.0
- Tokenizers 0.19.1