guru1984-v2

This model is a fine-tuned version of microsoft/Phi-3.5-mini-instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7375
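
Assuming the reported loss is the usual token-level cross-entropy (in nats), this corresponds to a perplexity of roughly exp(1.7375) ≈ 5.68. The checkpoint is a PEFT adapter rather than a full model, so it is loaded on top of the base model. A minimal inference sketch, assuming the adapter is published under the repo id emdemor/guru1984-v2:

```python
# Minimal inference sketch (not from the card): load the Phi-3.5-mini-instruct
# base model, then attach this PEFT adapter. Adjust dtype/device to your hardware.
# pip install transformers==4.44.0 peft==0.12.0 torch==2.4.0
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "microsoft/Phi-3.5-mini-instruct"
adapter_id = "emdemor/guru1984-v2"  # repo id as listed on this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Explain adapter fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```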

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
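
For reference, a sketch of how these settings map onto `transformers.TrainingArguments`. Only the values listed above come from this card; `output_dir` and the evaluation/logging cadence (every 50 steps, inferred from the results table below) are assumptions:

```python
# Sketch only: hyperparameter values are copied from the list above; output_dir
# and the eval/logging cadence are assumptions, since the card does not record them.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="guru1984-v2",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3,
    eval_strategy="steps",  # the table below logs validation loss every 50 steps
    eval_steps=50,
    logging_steps=50,
)
```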

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 4.2186 | 0.0393 | 50 | 3.8618 |
| 3.1772 | 0.0786 | 100 | 2.5934 |
| 2.401 | 0.1178 | 150 | 2.2560 |
| 2.1397 | 0.1571 | 200 | 2.1369 |
| 2.0834 | 0.1964 | 250 | 2.0805 |
| 2.055 | 0.2357 | 300 | 2.0563 |
| 2.043 | 0.2749 | 350 | 2.0286 |
| 2.0135 | 0.3142 | 400 | 2.0177 |
| 1.9971 | 0.3535 | 450 | 2.0020 |
| 1.9766 | 0.3928 | 500 | 1.9914 |
| 1.9677 | 0.4321 | 550 | 1.9789 |
| 1.9562 | 0.4713 | 600 | 1.9680 |
| 1.9594 | 0.5106 | 650 | 1.9631 |
| 1.9423 | 0.5499 | 700 | 1.9546 |
| 1.9587 | 0.5892 | 750 | 1.9470 |
| 1.9408 | 0.6284 | 800 | 1.9397 |
| 1.9816 | 0.6677 | 850 | 1.9425 |
| 1.9298 | 0.7070 | 900 | 1.9177 |
| 1.9021 | 0.7463 | 950 | 1.9150 |
| 1.9104 | 0.7855 | 1000 | 1.9072 |
| 1.9325 | 0.8248 | 1050 | 1.8993 |
| 1.9183 | 0.8641 | 1100 | 1.9054 |
| 1.9557 | 0.9034 | 1150 | 1.8948 |
| 1.9261 | 0.9427 | 1200 | 1.8823 |
| 1.9337 | 0.9819 | 1250 | 1.8785 |
| 1.9034 | 1.0212 | 1300 | 1.8770 |
| 1.8603 | 1.0605 | 1350 | 1.8668 |
| 1.8477 | 1.0998 | 1400 | 1.8662 |
| 1.8658 | 1.1390 | 1450 | 1.8574 |
| 1.8923 | 1.1783 | 1500 | 1.8574 |
| 1.8777 | 1.2176 | 1550 | 1.8603 |
| 1.8645 | 1.2569 | 1600 | 1.8517 |
| 1.8204 | 1.2962 | 1650 | 1.8447 |
| 1.8661 | 1.3354 | 1700 | 1.8400 |
| 1.8595 | 1.3747 | 1750 | 1.8384 |
| 1.857 | 1.4140 | 1800 | 1.8314 |
| 1.8431 | 1.4533 | 1850 | 1.8279 |
| 1.8249 | 1.4925 | 1900 | 1.8285 |
| 1.8372 | 1.5318 | 1950 | 1.8243 |
| 1.8589 | 1.5711 | 2000 | 1.8210 |
| 1.829 | 1.6104 | 2050 | 1.8053 |
| 1.8154 | 1.6496 | 2100 | 1.8002 |
| 1.8122 | 1.6889 | 2150 | 1.8008 |
| 1.8297 | 1.7282 | 2200 | 1.7969 |
| 1.8467 | 1.7675 | 2250 | 1.7963 |
| 1.8242 | 1.8068 | 2300 | 1.7973 |
| 1.8209 | 1.8460 | 2350 | 1.7902 |
| 1.8193 | 1.8853 | 2400 | 1.7890 |
| 1.8153 | 1.9246 | 2450 | 1.7839 |
| 1.7845 | 1.9639 | 2500 | 1.7780 |
| 1.7975 | 2.0031 | 2550 | 1.7794 |
| 1.7922 | 2.0424 | 2600 | 1.7733 |
| 1.7558 | 2.0817 | 2650 | 1.7721 |
| 1.7821 | 2.1210 | 2700 | 1.7694 |
| 1.7735 | 2.1603 | 2750 | 1.7644 |
| 1.7802 | 2.1995 | 2800 | 1.7630 |
| 1.7616 | 2.2388 | 2850 | 1.7603 |
| 1.7751 | 2.2781 | 2900 | 1.7580 |
| 1.7811 | 2.3174 | 2950 | 1.7550 |
| 1.7356 | 2.3566 | 3000 | 1.7529 |
| 1.7575 | 2.3959 | 3050 | 1.7514 |
| 1.7547 | 2.4352 | 3100 | 1.7510 |
| 1.7699 | 2.4745 | 3150 | 1.7522 |
| 1.7506 | 2.5137 | 3200 | 1.7496 |
| 1.7564 | 2.5530 | 3250 | 1.7441 |
| 1.7517 | 2.5923 | 3300 | 1.7436 |
| 1.7371 | 2.6316 | 3350 | 1.7433 |
| 1.7425 | 2.6709 | 3400 | 1.7430 |
| 1.7407 | 2.7101 | 3450 | 1.7402 |
| 1.7513 | 2.7494 | 3500 | 1.7408 |
| 1.7662 | 2.7887 | 3550 | 1.7384 |
| 1.7557 | 2.8280 | 3600 | 1.7397 |
| 1.7557 | 2.8672 | 3650 | 1.7405 |
| 1.753 | 2.9065 | 3700 | 1.7404 |
| 1.7788 | 2.9458 | 3750 | 1.7381 |
| 1.7539 | 2.9851 | 3800 | 1.7375 |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.4.0
  • Datasets 2.21.0
  • Tokenizers 0.19.1
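
A quick way to compare a local environment against these pins (sketch):

```python
# Print installed versions to compare against the pins listed above.
import datasets
import peft
import tokenizers
import torch
import transformers

for name, module in [
    ("PEFT", peft),
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```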