phi-1_5-lora-tuned-sft-dolly_hitesh

This model is a LoRA fine-tuned version of microsoft/phi-1_5, produced by supervised fine-tuning (SFT) on the generator dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3164

Model description

More information needed

Intended uses & limitations

More information needed
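
Pending a fuller description, here is a minimal usage sketch for loading the adapter. It assumes the PEFT and Transformers versions listed under "Framework versions" below; the prompt and generation settings are illustrative assumptions, not values from the training run.

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load the LoRA adapter; PEFT resolves and loads the microsoft/phi-1_5
# base model from the adapter config automatically.
repo_id = "HiteshJ14/phi-1_5-lora-tuned-sft-dolly_hitesh"
model = AutoPeftModelForCausalLM.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5")

# Illustrative prompt; max_new_tokens is an assumption.
prompt = "Explain what a LoRA adapter is."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```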

Training and evaluation data

More information needed

Hardware

The model was trained on an Intel Data Center GPU Max 1550.
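
For reference, a hedged sketch of how an Intel Max-series GPU is selected from PyTorch. It assumes intel_extension_for_pytorch (IPEX) is installed alongside the Intel PyTorch build listed under "Framework versions", which registers the "xpu" device type.

```python
import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device type

# Pick the Intel GPU when available, otherwise fall back to CPU.
device = torch.device("xpu" if torch.xpu.is_available() else "cpu")
print(f"Using device: {device}")
```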

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • training_steps: 1480
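
Below is a hedged reconstruction of these settings as Hugging Face TrainingArguments. The output_dir value is an assumption; the other arguments mirror the list above, and the stated Adam settings (betas=(0.9, 0.999), epsilon=1e-8) match the Transformers default optimizer.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-1_5-lora-tuned-sft-dolly_hitesh",  # assumed, not listed above
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # 2 x 8 = 16 total train batch size
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    max_steps=1480,
)
```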

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 2.8614        | 1.6129  | 100  | 2.6779          |
| 2.6089        | 3.2258  | 200  | 2.5131          |
| 2.5117        | 4.8387  | 300  | 2.4545          |
| 2.4636        | 6.4516  | 400  | 2.4229          |
| 2.4367        | 8.0645  | 500  | 2.3990          |
| 2.4091        | 9.6774  | 600  | 2.3761          |
| 2.3890        | 11.2903 | 700  | 2.3553          |
| 2.3639        | 12.9032 | 800  | 2.3394          |
| 2.3541        | 14.5161 | 900  | 2.3299          |
| 2.3418        | 16.1290 | 1000 | 2.3241          |
| 2.3395        | 17.7419 | 1100 | 2.3209          |
| 2.3319        | 19.3548 | 1200 | 2.3186          |
| 2.3363        | 20.9677 | 1300 | 2.3171          |
| 2.3327        | 22.5806 | 1400 | 2.3164          |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.2
  • Pytorch 2.1.0.post0+cxx11.abi
  • Datasets 2.19.1
  • Tokenizers 0.19.1