
phi-3-mini-LoRA

This model is a LoRA adapter (PEFT) fine-tuned from microsoft/Phi-3-mini-4k-instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6528
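
A minimal loading sketch, assuming this repository is published as a PEFT adapter on the Hub; the adapter id below is a placeholder and the prompt and generation settings are illustrative, not the card author's configuration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "microsoft/Phi-3-mini-4k-instruct"
adapter_id = "your-username/phi-3-mini-LoRA"  # placeholder, not a confirmed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Attach the LoRA weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Explain LoRA fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```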

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
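
A hedged sketch of how the hyperparameters listed above map onto Transformers/PEFT configuration objects. The dataset, LoRA rank, and target modules are not documented in this card, so the LoraConfig values below are assumptions, and the Trainer/dataset wiring is omitted:

```python
from transformers import TrainingArguments
from peft import LoraConfig

lora_config = LoraConfig(              # assumed values; not stated in this card
    r=16,
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="phi-3-mini-LoRA",
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,     # effective batch size: 32 * 4 = 128
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",               # default betas=(0.9, 0.999), eps=1e-8 match the listed Adam settings
    eval_strategy="steps",             # evaluation cadence assumed from the 100-step results table below
    eval_steps=100,
    logging_steps=100,
)
```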

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7804        | 0.2203 | 100  | 0.6979          |
| 0.6811        | 0.4405 | 200  | 0.6706          |
| 0.6681        | 0.6608 | 300  | 0.6644          |
| 0.6622        | 0.8811 | 400  | 0.6613          |
| 0.6602        | 1.1013 | 500  | 0.6592          |
| 0.6581        | 1.3216 | 600  | 0.6576          |
| 0.6564        | 1.5419 | 700  | 0.6563          |
| 0.6557        | 1.7621 | 800  | 0.6553          |
| 0.6541        | 1.9824 | 900  | 0.6545          |
| 0.6531        | 2.2026 | 1000 | 0.6540          |
| 0.6506        | 2.4229 | 1100 | 0.6534          |
| 0.6510        | 2.6432 | 1200 | 0.6530          |
| 0.6512        | 2.8634 | 1300 | 0.6528          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.43.3
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1