---
base_model: microsoft/Phi-3.5-mini-instruct
library_name: peft
license: mit
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: guru1984-v2
  results: []
pipeline_tag: text-generation
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# guru1984-v2

This model is a fine-tuned version of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.7375

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 4.2186        | 0.0393 | 50   | 3.8618          |
| 3.1772        | 0.0786 | 100  | 2.5934          |
| 2.401         | 0.1178 | 150  | 2.2560          |
| 2.1397        | 0.1571 | 200  | 2.1369          |
| 2.0834        | 0.1964 | 250  | 2.0805          |
| 2.055         | 0.2357 | 300  | 2.0563          |
| 2.043         | 0.2749 | 350  | 2.0286          |
| 2.0135        | 0.3142 | 400  | 2.0177          |
| 1.9971        | 0.3535 | 450  | 2.0020          |
| 1.9766        | 0.3928 | 500  | 1.9914          |
| 1.9677        | 0.4321 | 550  | 1.9789          |
| 1.9562        | 0.4713 | 600  | 1.9680          |
| 1.9594        | 0.5106 | 650  | 1.9631          |
| 1.9423        | 0.5499 | 700  | 1.9546          |
| 1.9587        | 0.5892 | 750  | 1.9470          |
| 1.9408        | 0.6284 | 800  | 1.9397          |
| 1.9816        | 0.6677 | 850  | 1.9425          |
| 1.9298        | 0.7070 | 900  | 1.9177          |
| 1.9021        | 0.7463 | 950  | 1.9150          |
| 1.9104        | 0.7855 | 1000 | 1.9072          |
| 1.9325        | 0.8248 | 1050 | 1.8993          |
| 1.9183        | 0.8641 | 1100 | 1.9054          |
| 1.9557        | 0.9034 | 1150 | 1.8948          |
| 1.9261        | 0.9427 | 1200 | 1.8823          |
| 1.9337        | 0.9819 | 1250 | 1.8785          |
| 1.9034        | 1.0212 | 1300 | 1.8770          |
| 1.8603        | 1.0605 | 1350 | 1.8668          |
| 1.8477        | 1.0998 | 1400 | 1.8662          |
| 1.8658        | 1.1390 | 1450 | 1.8574          |
| 1.8923        | 1.1783 | 1500 | 1.8574          |
| 1.8777        | 1.2176 | 1550 | 1.8603          |
| 1.8645        | 1.2569 | 1600 | 1.8517          |
| 1.8204        | 1.2962 | 1650 | 1.8447          |
| 1.8661        | 1.3354 | 1700 | 1.8400          |
| 1.8595        | 1.3747 | 1750 | 1.8384          |
| 1.857         | 1.4140 | 1800 | 1.8314          |
| 1.8431        | 1.4533 | 1850 | 1.8279          |
| 1.8249        | 1.4925 | 1900 | 1.8285          |
| 1.8372        | 1.5318 | 1950 | 1.8243          |
| 1.8589        | 1.5711 | 2000 | 1.8210          |
| 1.829         | 1.6104 | 2050 | 1.8053          |
| 1.8154        | 1.6496 | 2100 | 1.8002          |
| 1.8122        | 1.6889 | 2150 | 1.8008          |
| 1.8297        | 1.7282 | 2200 | 1.7969          |
| 1.8467        | 1.7675 | 2250 | 1.7963          |
| 1.8242        | 1.8068 | 2300 | 1.7973          |
| 1.8209        | 1.8460 | 2350 | 1.7902          |
| 1.8193        | 1.8853 | 2400 | 1.7890          |
| 1.8153        | 1.9246 | 2450 | 1.7839          |
| 1.7845        | 1.9639 | 2500 | 1.7780          |
| 1.7975        | 2.0031 | 2550 | 1.7794          |
| 1.7922        | 2.0424 | 2600 | 1.7733          |
| 1.7558        | 2.0817 | 2650 | 1.7721          |
| 1.7821        | 2.1210 | 2700 | 1.7694          |
| 1.7735        | 2.1603 | 2750 | 1.7644          |
| 1.7802        | 2.1995 | 2800 | 1.7630          |
| 1.7616        | 2.2388 | 2850 | 1.7603          |
| 1.7751        | 2.2781 | 2900 | 1.7580          |
| 1.7811        | 2.3174 | 2950 | 1.7550          |
| 1.7356        | 2.3566 | 3000 | 1.7529          |
| 1.7575        | 2.3959 | 3050 | 1.7514          |
| 1.7547        | 2.4352 | 3100 | 1.7510          |
| 1.7699        | 2.4745 | 3150 | 1.7522          |
| 1.7506        | 2.5137 | 3200 | 1.7496          |
| 1.7564        | 2.5530 | 3250 | 1.7441          |
| 1.7517        | 2.5923 | 3300 | 1.7436          |
| 1.7371        | 2.6316 | 3350 | 1.7433          |
| 1.7425        | 2.6709 | 3400 | 1.7430          |
| 1.7407        | 2.7101 | 3450 | 1.7402          |
| 1.7513        | 2.7494 | 3500 | 1.7408          |
| 1.7662        | 2.7887 | 3550 | 1.7384          |
| 1.7557        | 2.8280 | 3600 | 1.7397          |
| 1.7557        | 2.8672 | 3650 | 1.7405          |
| 1.753         | 2.9065 | 3700 | 1.7404          |
| 1.7788        | 2.9458 | 3750 | 1.7381          |
| 1.7539        | 2.9851 | 3800 | 1.7375          |


### Framework versions

- PEFT 0.12.0
- Transformers 4.44.0
- Pytorch 2.4.0
- Datasets 2.21.0
- Tokenizers 0.19.1