File size: 1,714 Bytes
f064bbe
 
0b22f76
 
 
 
 
 
 
 
 
f064bbe
0b22f76
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
---
license: mit
datasets:
- Open-Orca/SlimOrca-Dedup
- migtissera/Synthia-v1.3
- LDJnr/Verified-Camel
- LDJnr/Pure-Dove
- LDJnr/Capybara
- meta-math/MetaMathQA
- Intel/orca_dpo_pairs
- argilla/ultrafeedback-binarized-preferences-cleaned
---
![Phi-2 Orange](https://huggingface.co/rhysjones/phi-2-orange-v2/resolve/main/phi-2-orange.jpg)

# Phi-2 Orange Version 2

A two-step finetune of Phi-2, with a bit more zest.

First using a collection of broad training data:

- [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
- [migtissera/Synthia-v1.3](https://huggingface.co/datasets/migtissera/Synthia-v1.3)
- [LDJnr/Verified-Camel](https://huggingface.co/datasets/LDJnr/Verified-Camel)
- [LDJnr/Pure-Dove](https://huggingface.co/datasets/LDJnr/Pure-Dove)
- [LDJnr/Capybara](https://huggingface.co/datasets/LDJnr/Capybara)
- [meta-math/MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)

And then a DPO finetune using:

- [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
- [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)


# Prompt Format

Phi-2 Orange uses ChatML as the prompt format, with or without the system instruction.

To prompt with a system instruction (use whatever system prompt you like):

```
<|im_start|>system
You are a helpful assistant for Python which outputs in Markdown format.<|im_end|>
<|im_start|>user
Write a function to calculate the Fibonacci sequence<|im_end|>
<|im_start|>assistant

```

You can also omit the system prompt if you wish:

```
<|im_start|>user
Why is the sky blue?<|im_end|>
<|im_start|>assistant

```