
phi3mini_4k_i_RE_QA_alpha8_r_8

This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4399

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
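
Taken together, the settings above imply an effective batch size of train_batch_size × gradient_accumulation_steps = 32 × 4 = 128, and a linear learning-rate schedule that warms up over the first 10% of optimizer steps before decaying to zero. A minimal sketch of that schedule (the total step count of ~840 for one epoch is an assumption inferred from the training results below, not stated in the card):

```python
def linear_schedule_lr(step, total_steps, peak_lr=1e-4, warmup_ratio=0.1):
    """Linear warmup to peak_lr over warmup_ratio of training, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# Effective batch size implied by the hyperparameters above.
effective_batch = 32 * 4  # train_batch_size * gradient_accumulation_steps = 128

# Example: learning rate at the start, at the end of warmup, and at the end of training
# (assuming ~840 total optimizer steps for the single epoch).
lr_start = linear_schedule_lr(0, 840)    # 0.0
lr_peak = linear_schedule_lr(84, 840)    # 1e-4 (warmup complete)
lr_end = linear_schedule_lr(840, 840)    # 0.0
```

This mirrors what `lr_scheduler_type: linear` with `lr_scheduler_warmup_ratio: 0.1` does inside the Transformers `Trainer`; it is a sketch, not the library implementation.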

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.7043        | 0.1187 | 100  | 0.5258          |
| 0.5348        | 0.2374 | 200  | 0.4702          |
| 0.5086        | 0.3561 | 300  | 0.4555          |
| 0.4973        | 0.4748 | 400  | 0.4484          |
| 0.4909        | 0.5935 | 500  | 0.4446          |
| 0.4842        | 0.7122 | 600  | 0.4419          |
| 0.4802        | 0.8309 | 700  | 0.4406          |
| 0.4797        | 0.9496 | 800  | 0.4399          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.2.1
  • Datasets 2.19.2
  • Tokenizers 0.19.1
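
Since the card lists PEFT among the framework versions, this repository presumably holds a LoRA adapter rather than full model weights. A minimal usage sketch under that assumption; the `<user>/phi3mini_4k_i_RE_QA_alpha8_r_8` repo id is a placeholder for the actual hub id:

```python
# Hedged sketch: load the base model, then attach this adapter via PEFT.
# Assumes transformers>=4.41 and peft>=0.11 (the versions listed above).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Placeholder repo id (assumption) -- replace with the real adapter location.
model = PeftModel.from_pretrained(base, "<user>/phi3mini_4k_i_RE_QA_alpha8_r_8")

inputs = tokenizer("Hello", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```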