Visualize in Weights & Biases

phi-3-mini-LoRA

This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5538
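
As a usage reference, the following is a minimal, untested sketch of loading this adapter on top of the base model with PEFT. The prompt, dtype, and device settings are illustrative assumptions, not part of this card.

```python
# Minimal usage sketch (assumptions, not from this card): load the base
# model, then apply this LoRA adapter with PEFT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "microsoft/Phi-3-mini-4k-instruct"
adapter_id = "pzs26401/phi-3-mini-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # illustrative; pick a dtype your hardware supports
    device_map="auto",           # requires accelerate
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Hello, how are you?"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```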

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch in code follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
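
For reference, here is a hedged reconstruction of the list above as a Transformers TrainingArguments object. Only the TrainingArguments values come from this card; the LoRA hyperparameters (r, alpha, dropout, target modules) and the dataset are not documented here, so the LoraConfig values are illustrative placeholders.

```python
# Hedged reconstruction of the hyperparameters listed above. The LoraConfig
# values are placeholders, since the actual LoRA settings are not documented
# in this card.
from transformers import TrainingArguments
from peft import LoraConfig

training_args = TrainingArguments(
    output_dir="phi-3-mini-LoRA",
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,   # 4 x 4 = total train batch size 16
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    adam_beta1=0.9,                  # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

peft_config = LoraConfig(
    r=16,               # assumption, not stated in the card
    lora_alpha=32,      # assumption
    lora_dropout=0.05,  # assumption
    task_type="CAUSAL_LM",
)
```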

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9859        | 0.0905 | 100  | 0.7611          |
| 0.6236        | 0.1810 | 200  | 0.5919          |
| 0.5846        | 0.2714 | 300  | 0.5761          |
| 0.5813        | 0.3619 | 400  | 0.5696          |
| 0.5696        | 0.4524 | 500  | 0.5659          |
| 0.5616        | 0.5429 | 600  | 0.5638          |
| 0.5682        | 0.6333 | 700  | 0.5621          |
| 0.5683        | 0.7238 | 800  | 0.5615          |
| 0.5532        | 0.8143 | 900  | 0.5599          |
| 0.5571        | 0.9048 | 1000 | 0.5596          |
| 0.5695        | 0.9952 | 1100 | 0.5586          |
| 0.5547        | 1.0857 | 1200 | 0.5578          |
| 0.5524        | 1.1762 | 1300 | 0.5574          |
| 0.5458        | 1.2667 | 1400 | 0.5568          |
| 0.5447        | 1.3572 | 1500 | 0.5563          |
| 0.5566        | 1.4476 | 1600 | 0.5561          |
| 0.5678        | 1.5381 | 1700 | 0.5557          |
| 0.5590        | 1.6286 | 1800 | 0.5553          |
| 0.5528        | 1.7191 | 1900 | 0.5556          |
| 0.5523        | 1.8095 | 2000 | 0.5548          |
| 0.5481        | 1.9000 | 2100 | 0.5550          |
| 0.5545        | 1.9905 | 2200 | 0.5546          |
| 0.5412        | 2.0810 | 2300 | 0.5544          |
| 0.5449        | 2.1715 | 2400 | 0.5543          |
| 0.5657        | 2.2619 | 2500 | 0.5543          |
| 0.5484        | 2.3524 | 2600 | 0.5541          |
| 0.5553        | 2.4429 | 2700 | 0.5540          |
| 0.5398        | 2.5334 | 2800 | 0.5540          |
| 0.5488        | 2.6238 | 2900 | 0.5537          |
| 0.5484        | 2.7143 | 3000 | 0.5538          |
| 0.5512        | 2.8048 | 3100 | 0.5538          |
| 0.5493        | 2.8953 | 3200 | 0.5537          |
| 0.5404        | 2.9857 | 3300 | 0.5538          |
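If the validation loss is the mean token-level cross-entropy in nats (the usual Hugging Face Trainer convention; an assumption, since the metric is not specified here), the final value corresponds to a perplexity of about exp(0.5538) ≈ 1.74:

```python
# Perplexity implied by the final validation loss, assuming the loss is
# mean cross-entropy in nats.
import math

print(math.exp(0.5538))  # ≈ 1.74
```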

Framework versions

  • PEFT 0.11.1
  • Transformers 4.42.4
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1