phi-3-mini-LoRA

This model is a LoRA adapter for microsoft/Phi-3-mini-4k-instruct, fine-tuned on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5513
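
Assuming this is the standard per-token cross-entropy loss, it corresponds to an evaluation-set perplexity of exp(0.5513) ≈ 1.74.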

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
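
As a rough sketch (not the published training script), the settings above map onto transformers.TrainingArguments as follows; the output_dir and the 100-step evaluation cadence, inferred from the results table below, are assumptions.

```python
# Sketch of the reported hyperparameters as TrainingArguments.
# Reconstructed from the list above, not from the original script;
# output_dir and the eval/logging cadence are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-3-mini-LoRA",    # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,   # 2 per device x 4 steps = total batch size 8
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,                  # Adam parameters as reported above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",           # evaluate every 100 steps, matching the table
    eval_steps=100,
    logging_steps=100,
)
```

These arguments would be passed to a Trainer (or TRL's SFTTrainer) together with a PEFT LoraConfig; the LoRA rank, alpha, and target modules are not reported in this card.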

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 1.0558 | 0.0452 | 100 | 0.9096 |
| 0.702 | 0.0905 | 200 | 0.6207 |
| 0.5897 | 0.1357 | 300 | 0.5882 |
| 0.5837 | 0.1810 | 400 | 0.5784 |
| 0.5862 | 0.2262 | 500 | 0.5761 |
| 0.5758 | 0.2715 | 600 | 0.5718 |
| 0.5738 | 0.3167 | 700 | 0.5703 |
| 0.5874 | 0.3620 | 800 | 0.5684 |
| 0.5621 | 0.4072 | 900 | 0.5667 |
| 0.5769 | 0.4524 | 1000 | 0.5648 |
| 0.5739 | 0.4977 | 1100 | 0.5645 |
| 0.5526 | 0.5429 | 1200 | 0.5631 |
| 0.5765 | 0.5882 | 1300 | 0.5622 |
| 0.5609 | 0.6334 | 1400 | 0.5613 |
| 0.5714 | 0.6787 | 1500 | 0.5609 |
| 0.5646 | 0.7239 | 1600 | 0.5614 |
| 0.5652 | 0.7691 | 1700 | 0.5602 |
| 0.5488 | 0.8144 | 1800 | 0.5604 |
| 0.5629 | 0.8596 | 1900 | 0.5597 |
| 0.5546 | 0.9049 | 2000 | 0.5585 |
| 0.577 | 0.9501 | 2100 | 0.5582 |
| 0.5597 | 0.9954 | 2200 | 0.5579 |
| 0.5538 | 1.0406 | 2300 | 0.5575 |
| 0.5483 | 1.0859 | 2400 | 0.5575 |
| 0.5585 | 1.1311 | 2500 | 0.5572 |
| 0.5485 | 1.1763 | 2600 | 0.5573 |
| 0.5502 | 1.2216 | 2700 | 0.5573 |
| 0.5401 | 1.2668 | 2800 | 0.5563 |
| 0.5484 | 1.3121 | 2900 | 0.5559 |
| 0.5429 | 1.3573 | 3000 | 0.5558 |
| 0.5617 | 1.4026 | 3100 | 0.5551 |
| 0.5483 | 1.4478 | 3200 | 0.5550 |
| 0.5678 | 1.4930 | 3300 | 0.5548 |
| 0.5533 | 1.5383 | 3400 | 0.5544 |
| 0.5586 | 1.5835 | 3500 | 0.5549 |
| 0.5537 | 1.6288 | 3600 | 0.5541 |
| 0.5512 | 1.6740 | 3700 | 0.5542 |
| 0.554 | 1.7193 | 3800 | 0.5539 |
| 0.5626 | 1.7645 | 3900 | 0.5536 |
| 0.5329 | 1.8098 | 4000 | 0.5533 |
| 0.5389 | 1.8550 | 4100 | 0.5532 |
| 0.5526 | 1.9002 | 4200 | 0.5532 |
| 0.5542 | 1.9455 | 4300 | 0.5527 |
| 0.5528 | 1.9907 | 4400 | 0.5532 |
| 0.5335 | 2.0360 | 4500 | 0.5531 |
| 0.543 | 2.0812 | 4600 | 0.5538 |
| 0.5452 | 2.1265 | 4700 | 0.5531 |
| 0.5352 | 2.1717 | 4800 | 0.5527 |
| 0.5507 | 2.2169 | 4900 | 0.5526 |
| 0.5553 | 2.2622 | 5000 | 0.5524 |
| 0.5395 | 2.3074 | 5100 | 0.5525 |
| 0.54 | 2.3527 | 5200 | 0.5523 |
| 0.5329 | 2.3979 | 5300 | 0.5521 |
| 0.5628 | 2.4432 | 5400 | 0.5521 |
| 0.542 | 2.4884 | 5500 | 0.5522 |
| 0.5244 | 2.5337 | 5600 | 0.5520 |
| 0.5348 | 2.5789 | 5700 | 0.5519 |
| 0.5465 | 2.6241 | 5800 | 0.5519 |
| 0.5392 | 2.6694 | 5900 | 0.5518 |
| 0.5412 | 2.7146 | 6000 | 0.5518 |
| 0.5497 | 2.7599 | 6100 | 0.5517 |
| 0.5354 | 2.8051 | 6200 | 0.5514 |
| 0.534 | 2.8504 | 6300 | 0.5515 |
| 0.5478 | 2.8956 | 6400 | 0.5514 |
| 0.5302 | 2.9408 | 6500 | 0.5513 |
| 0.5366 | 2.9861 | 6600 | 0.5513 |
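
The validation loss drops sharply over the first 300 steps (0.9096 to 0.5882) and then improves only gradually, flattening out near 0.55 from roughly epoch 2 onward.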

Framework versions

  • PEFT 0.12.0
  • Transformers 4.43.1
  • PyTorch 2.4.0a0+3bcc3cddb5.nv24.07
  • Datasets 2.20.0
  • Tokenizers 0.19.1
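
Because this checkpoint is a PEFT LoRA adapter rather than a full model, it must be loaded on top of the base model. Below is a minimal inference sketch, assuming the framework versions above; the adapter id "username/phi-3-mini-LoRA" is a placeholder for this repository's actual id.

```python
# Sketch: load the LoRA adapter onto the base model and generate.
# "username/phi-3-mini-LoRA" is a placeholder adapter id.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Attach the fine-tuned LoRA weights to the frozen base model.
model = PeftModel.from_pretrained(base_model, "username/phi-3-mini-LoRA")

messages = [{"role": "user", "content": "Explain LoRA fine-tuning in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If standalone weights are preferred, model.merge_and_unload() folds the LoRA deltas into the base model so it can be saved and served without PEFT.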