

phi-3-mini-LoRA

This model is a LoRA adapter for microsoft/Phi-3-mini-4k-instruct, fine-tuned with PEFT on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2015
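Because the repository contains a PEFT LoRA adapter rather than full model weights (see the framework versions below), it must be loaded on top of the base model. A minimal loading sketch, assuming standard PEFT/Transformers usage rather than the author's own code:

```python
# Minimal sketch: load the LoRA adapter on top of the base model.
# Standard PEFT/Transformers usage; not the author's published code.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# AutoPeftModelForCausalLM reads the adapter config, downloads the base
# model (microsoft/Phi-3-mini-4k-instruct), and attaches the LoRA weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    "alizaidi/phi-3-mini-LoRA",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

prompt = "Explain LoRA fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```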

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
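Note that total_train_batch_size is the derived effective batch size: 8 (per-device) × 4 (gradient accumulation) = 32. For reference, a sketch of how these values map onto Hugging Face TrainingArguments; the output directory and the eval/logging cadence (every 100 steps, matching the table below) are assumptions, since the card does not include the original training script:

```python
# Sketch: TrainingArguments matching the hyperparameters above.
# Only the numeric values come from the card; output_dir and the
# eval/logging cadence are illustrative assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi-3-mini-LoRA",    # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # 8 x 4 = effective batch size of 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-8 are the defaults
    eval_strategy="steps",           # the table below logs eval every 100 steps
    eval_steps=100,
    logging_steps=100,
)
```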

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.9715        | 0.0242 | 100   | 1.9517          |
| 1.8637        | 0.0484 | 200   | 1.7875          |
| 1.7019        | 0.0725 | 300   | 1.6473          |
| 1.6127        | 0.0967 | 400   | 1.5828          |
| 1.5545        | 0.1209 | 500   | 1.5389          |
| 1.5144        | 0.1451 | 600   | 1.5045          |
| 1.4823        | 0.1693 | 700   | 1.4746          |
| 1.4535        | 0.1935 | 800   | 1.4502          |
| 1.4293        | 0.2176 | 900   | 1.4270          |
| 1.4132        | 0.2418 | 1000  | 1.4073          |
| 1.388         | 0.2660 | 1100  | 1.3880          |
| 1.3757        | 0.2902 | 1200  | 1.3706          |
| 1.3594        | 0.3144 | 1300  | 1.3543          |
| 1.3399        | 0.3386 | 1400  | 1.3410          |
| 1.3314        | 0.3627 | 1500  | 1.3284          |
| 1.3161        | 0.3869 | 1600  | 1.3167          |
| 1.3005        | 0.4111 | 1700  | 1.3084          |
| 1.2937        | 0.4353 | 1800  | 1.2987          |
| 1.2824        | 0.4595 | 1900  | 1.2920          |
| 1.2806        | 0.4836 | 2000  | 1.2859          |
| 1.2773        | 0.5078 | 2100  | 1.2793          |
| 1.2717        | 0.5320 | 2200  | 1.2738          |
| 1.2654        | 0.5562 | 2300  | 1.2692          |
| 1.2597        | 0.5804 | 2400  | 1.2644          |
| 1.2536        | 0.6046 | 2500  | 1.2601          |
| 1.2486        | 0.6287 | 2600  | 1.2560          |
| 1.2416        | 0.6529 | 2700  | 1.2527          |
| 1.2462        | 0.6771 | 2800  | 1.2494          |
| 1.2402        | 0.7013 | 2900  | 1.2465          |
| 1.2353        | 0.7255 | 3000  | 1.2434          |
| 1.2285        | 0.7497 | 3100  | 1.2410          |
| 1.2314        | 0.7738 | 3200  | 1.2384          |
| 1.2342        | 0.7980 | 3300  | 1.2357          |
| 1.2195        | 0.8222 | 3400  | 1.2339          |
| 1.2306        | 0.8464 | 3500  | 1.2316          |
| 1.2225        | 0.8706 | 3600  | 1.2301          |
| 1.2174        | 0.8947 | 3700  | 1.2281          |
| 1.2293        | 0.9189 | 3800  | 1.2267          |
| 1.2194        | 0.9431 | 3900  | 1.2250          |
| 1.2169        | 0.9673 | 4000  | 1.2234          |
| 1.2138        | 0.9915 | 4100  | 1.2224          |
| 1.2105        | 1.0157 | 4200  | 1.2214          |
| 1.2081        | 1.0398 | 4300  | 1.2201          |
| 1.2129        | 1.0640 | 4400  | 1.2188          |
| 1.1995        | 1.0882 | 4500  | 1.2177          |
| 1.196         | 1.1124 | 4600  | 1.2167          |
| 1.2041        | 1.1366 | 4700  | 1.2163          |
| 1.2104        | 1.1608 | 4800  | 1.2151          |
| 1.205         | 1.1849 | 4900  | 1.2144          |
| 1.2055        | 1.2091 | 5000  | 1.2135          |
| 1.1966        | 1.2333 | 5100  | 1.2128          |
| 1.2017        | 1.2575 | 5200  | 1.2120          |
| 1.1995        | 1.2817 | 5300  | 1.2117          |
| 1.2015        | 1.3058 | 5400  | 1.2108          |
| 1.1978        | 1.3300 | 5500  | 1.2103          |
| 1.2017        | 1.3542 | 5600  | 1.2098          |
| 1.196         | 1.3784 | 5700  | 1.2093          |
| 1.1976        | 1.4026 | 5800  | 1.2089          |
| 1.2057        | 1.4268 | 5900  | 1.2082          |
| 1.2012        | 1.4509 | 6000  | 1.2079          |
| 1.2067        | 1.4751 | 6100  | 1.2074          |
| 1.2048        | 1.4993 | 6200  | 1.2071          |
| 1.2011        | 1.5235 | 6300  | 1.2068          |
| 1.1911        | 1.5477 | 6400  | 1.2064          |
| 1.1974        | 1.5719 | 6500  | 1.2061          |
| 1.1934        | 1.5960 | 6600  | 1.2059          |
| 1.1896        | 1.6202 | 6700  | 1.2057          |
| 1.1895        | 1.6444 | 6800  | 1.2052          |
| 1.203         | 1.6686 | 6900  | 1.2051          |
| 1.191         | 1.6928 | 7000  | 1.2048          |
| 1.1995        | 1.7169 | 7100  | 1.2045          |
| 1.1979        | 1.7411 | 7200  | 1.2043          |
| 1.1918        | 1.7653 | 7300  | 1.2042          |
| 1.1969        | 1.7895 | 7400  | 1.2040          |
| 1.1869        | 1.8137 | 7500  | 1.2038          |
| 1.1871        | 1.8379 | 7600  | 1.2036          |
| 1.1988        | 1.8620 | 7700  | 1.2035          |
| 1.1942        | 1.8862 | 7800  | 1.2034          |
| 1.1931        | 1.9104 | 7900  | 1.2033          |
| 1.1947        | 1.9346 | 8000  | 1.2030          |
| 1.1932        | 1.9588 | 8100  | 1.2030          |
| 1.1922        | 1.9830 | 8200  | 1.2028          |
| 1.192         | 2.0071 | 8300  | 1.2027          |
| 1.1997        | 2.0313 | 8400  | 1.2027          |
| 1.1945        | 2.0555 | 8500  | 1.2026          |
| 1.1934        | 2.0797 | 8600  | 1.2026          |
| 1.1955        | 2.1039 | 8700  | 1.2024          |
| 1.1901        | 2.1280 | 8800  | 1.2024          |
| 1.1898        | 2.1522 | 8900  | 1.2023          |
| 1.186         | 2.1764 | 9000  | 1.2022          |
| 1.1858        | 2.2006 | 9100  | 1.2022          |
| 1.1965        | 2.2248 | 9200  | 1.2021          |
| 1.1835        | 2.2490 | 9300  | 1.2021          |
| 1.1983        | 2.2731 | 9400  | 1.2020          |
| 1.1813        | 2.2973 | 9500  | 1.2020          |
| 1.1903        | 2.3215 | 9600  | 1.2019          |
| 1.1952        | 2.3457 | 9700  | 1.2019          |
| 1.1899        | 2.3699 | 9800  | 1.2018          |
| 1.2011        | 2.3941 | 9900  | 1.2018          |
| 1.1936        | 2.4182 | 10000 | 1.2018          |
| 1.1931        | 2.4424 | 10100 | 1.2018          |
| 1.1991        | 2.4666 | 10200 | 1.2017          |
| 1.19          | 2.4908 | 10300 | 1.2017          |
| 1.1913        | 2.5150 | 10400 | 1.2016          |
| 1.1886        | 2.5391 | 10500 | 1.2017          |
| 1.1848        | 2.5633 | 10600 | 1.2016          |
| 1.1875        | 2.5875 | 10700 | 1.2016          |
| 1.1887        | 2.6117 | 10800 | 1.2016          |
| 1.1866        | 2.6359 | 10900 | 1.2016          |
| 1.188         | 2.6601 | 11000 | 1.2016          |
| 1.1952        | 2.6842 | 11100 | 1.2015          |
| 1.1947        | 2.7084 | 11200 | 1.2015          |
| 1.1905        | 2.7326 | 11300 | 1.2015          |
| 1.1838        | 2.7568 | 11400 | 1.2015          |
| 1.1893        | 2.7810 | 11500 | 1.2015          |
| 1.1808        | 2.8052 | 11600 | 1.2015          |
| 1.1909        | 2.8293 | 11700 | 1.2015          |
| 1.1858        | 2.8535 | 11800 | 1.2015          |
| 1.185         | 2.8777 | 11900 | 1.2015          |
| 1.1947        | 2.9019 | 12000 | 1.2015          |
| 1.1868        | 2.9261 | 12100 | 1.2014          |
| 1.1872        | 2.9502 | 12200 | 1.2015          |
| 1.1852        | 2.9744 | 12300 | 1.2015          |
| 1.185         | 2.9986 | 12400 | 1.2015          |
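Validation loss drops steeply during the first epoch and is essentially flat after step ~8000. Assuming the reported loss is mean token-level cross-entropy in nats, perplexity is exp(loss); a quick check on a few checkpoints from the table above:

```python
# Perplexity = exp(cross-entropy loss); loss values are copied from the
# table above (assuming the loss is mean cross-entropy in nats).
import math

checkpoints = {
    100: 1.9517,    # first evaluation
    4100: 1.2224,   # roughly end of epoch 1
    8200: 1.2028,   # roughly end of epoch 2
    12400: 1.2015,  # final evaluation
}
for step, loss in checkpoints.items():
    print(f"step {step:>5}: loss {loss:.4f} -> perplexity {math.exp(loss):.2f}")
# step   100: loss 1.9517 -> perplexity 7.04
# step 12400: loss 1.2015 -> perplexity 3.33
# i.e. most of the improvement comes from the first epoch.
```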

Framework versions

  • PEFT 0.11.1
  • Transformers 4.43.1
  • PyTorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1