---
license: other
license_name: phi-3
license_link: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/raw/main/LICENSE
datasets:
- m-a-p/CodeFeedback-Filtered-Instruction
tags:
- phi
- phi-3
- '3'
- code
---
A finetune of [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) on [m-a-p/CodeFeedback-Filtered-Instruction](https://huggingface.co/datasets/m-a-p/CodeFeedback-Filtered-Instruction), trained for ~9-10 hours on a single RTX 3090 (24 GB).
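
A minimal usage sketch (not part of the original card): the model id below is a hypothetical placeholder for this repository, and `trust_remote_code=True` follows the base model's loading instructions at release.

```python
# Usage sketch only. "<your-username>/<this-repo>" is a hypothetical
# placeholder; substitute this repository's actual model id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<your-username>/<this-repo>"  # hypothetical placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,  # required by the Phi-3 base model at release
)

messages = [
    {"role": "user", "content": "Write a Python function that checks if a string is a palindrome."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```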

Due to limited resources and time, training covered only about half of one epoch (0.5136 epochs). Final training loss:

```
  train_loss: 0.43311
```

The run used the following hyperparameters:

```
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_length=1024,
    max_prompt_length=512,
    overwrite_output_dir=True,
    beta=0.1,
    gradient_accumulation_steps=8,
    optim="adamw_torch",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,
    logging_steps=1,
    warmup_steps=50,
    fp16=True,
    save_steps=50
```
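
The `beta`, `max_length`, and `max_prompt_length` entries above are not standard `transformers.TrainingArguments` fields; they match the signature of TRL's preference-optimization trainers (for example `DPOTrainer` in older TRL releases, which accepted them as direct keyword arguments). The card does not state which trainer was actually used, so the following is only a hedged sketch of how these values could be wired together, with placeholder model and dataset handling:

```python
# Hedged sketch only: assumes TRL's DPOTrainer (older API where beta/max_length
# are trainer kwargs). The card does not confirm which trainer was used, and
# the dataset handling below is a placeholder, not the author's pipeline.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "microsoft/Phi-3-mini-128k-instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16, trust_remote_code=True
)

# Placeholder: DPO expects prompt/chosen/rejected columns; how the instruction
# dataset was converted to that format is not described in the card.
dataset = load_dataset("m-a-p/CodeFeedback-Filtered-Instruction", split="train")
splits = dataset.train_test_split(test_size=0.05)

args = TrainingArguments(
    output_dir="phi-3-mini-code",      # placeholder name
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    overwrite_output_dir=True,
    gradient_accumulation_steps=8,
    optim="adamw_torch",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,                    # fraction < 1 is a ratio of total steps
    logging_steps=1,
    warmup_steps=50,
    fp16=True,
    save_steps=50,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                    # TRL creates the frozen reference copy
    args=args,
    beta=0.1,
    max_length=1024,
    max_prompt_length=512,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    tokenizer=tokenizer,
)
trainer.train()
```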