---
license: mit
library_name: peft
tags:
  - generated_from_trainer
base_model: microsoft/Phi-3-mini-128k-instruct
model-index:
  - name: working
    results: []
---

# working

This model is a fine-tuned version of [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 0.4364
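
Because this checkpoint is a PEFT adapter rather than a full set of model weights, it must be loaded on top of the base model. A minimal loading sketch, assuming the adapter weights are published under the repo id shown on the hosting page and using a made-up intent/entity prompt:

```python
# Minimal loading sketch. Assumptions: the adapter lives in the hosting
# repo "Narkantak/phi3-Intent-entity-Classifier-Ashuv2", and the prompt
# below is an illustrative example, not taken from the training data.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "microsoft/Phi-3-mini-128k-instruct"
adapter_id = "Narkantak/phi3-Intent-entity-Classifier-Ashuv2"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # Phi-3 requires remote code on Transformers 4.39.x
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter

prompt = "Classify the intent and entities: book a flight to Paris tomorrow"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```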

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a `TrainingArguments` sketch consistent with these values follows the list:

- learning_rate: 0.0002
- train_batch_size: 6
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 24
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 30
- mixed_precision_training: Native AMP
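
For reference, these values map onto Hugging Face `TrainingArguments` roughly as sketched below. The output directory and the per-epoch evaluation cadence are assumptions (the latter inferred from the results table, which logs one validation loss per epoch), not recorded settings:

```python
# Rough TrainingArguments reconstruction of the hyperparameter list above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="working",            # assumed from the model name
    learning_rate=2e-4,
    per_device_train_batch_size=6,
    per_device_eval_batch_size=6,
    seed=42,
    gradient_accumulation_steps=4,   # 6 * 4 = 24 effective train batch size
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=30,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="epoch",     # assumed: validation loss logged once per epoch
    # Trainer's default AdamW already uses betas=(0.9, 0.999), epsilon=1e-08,
    # matching the optimizer listed above, so no explicit optim setting is needed.
)
```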

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.8501        | 1.0   | 3    | 2.2605          |
| 2.1387        | 2.0   | 6    | 1.7344          |
| 1.5826        | 3.0   | 9    | 1.3666          |
| 1.2187        | 4.0   | 12   | 1.0485          |
| 0.8879        | 5.0   | 15   | 0.7558          |
| 0.6134        | 6.0   | 18   | 0.5396          |
| 0.4343        | 7.0   | 21   | 0.4304          |
| 0.3557        | 8.0   | 24   | 0.3943          |
| 0.3205        | 9.0   | 27   | 0.3689          |
| 0.2947        | 10.0  | 30   | 0.3580          |
| 0.2727        | 11.0  | 33   | 0.3371          |
| 0.2506        | 12.0  | 36   | 0.3361          |
| 0.2291        | 13.0  | 39   | 0.3342          |
| 0.2098        | 14.0  | 42   | 0.3332          |
| 0.1911        | 15.0  | 45   | 0.3446          |
| 0.1761        | 16.0  | 48   | 0.3334          |
| 0.159         | 17.0  | 51   | 0.3453          |
| 0.1399        | 18.0  | 54   | 0.3540          |
| 0.124         | 19.0  | 57   | 0.3631          |
| 0.1123        | 20.0  | 60   | 0.3636          |
| 0.0992        | 21.0  | 63   | 0.3778          |
| 0.0862        | 22.0  | 66   | 0.3862          |
| 0.0783        | 23.0  | 69   | 0.3966          |
| 0.0704        | 24.0  | 72   | 0.4072          |
| 0.0627        | 25.0  | 75   | 0.4178          |
| 0.0582        | 26.0  | 78   | 0.4200          |
| 0.0553        | 27.0  | 81   | 0.4283          |
| 0.0521        | 28.0  | 84   | 0.4338          |
| 0.0505        | 29.0  | 87   | 0.4366          |
| 0.0494        | 30.0  | 90   | 0.4364          |

### Framework versions

- PEFT 0.10.0
- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2
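
A quick way to compare a local environment against these pins (a convenience sketch; nothing here is model-specific):

```python
# Print installed versions to compare against the pinned list above.
import datasets, peft, tokenizers, torch, transformers

for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}=={mod.__version__}")
```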