phi-2-orange-v2 / README.md
rhysjones's picture
Upload folder using huggingface_hub
0b22f76 verified
|
raw
history blame
No virus
1.71 kB
---
license: mit
datasets:
- Open-Orca/SlimOrca-Dedup
- migtissera/Synthia-v1.3
- LDJnr/Verified-Camel
- LDJnr/Pure-Dove
- LDJnr/Capybara
- meta-math/MetaMathQA
- Intel/orca_dpo_pairs
- argilla/ultrafeedback-binarized-preferences-cleaned
---
![Phi-2 Orange](https://huggingface.co/rhysjones/phi-2-orange-v2/resolve/main/phi-2-orange.jpg)
# Phi-2 Orange Version 2
A two-step finetune of Phi-2, with a bit more zest.
First using a collection of broad training data:
- [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
- [migtissera/Synthia-v1.3](https://huggingface.co/datasets/migtissera/Synthia-v1.3)
- [LDJnr/Verified-Camel](https://huggingface.co/datasets/LDJnr/Verified-Camel)
- [LDJnr/Pure-Dove](https://huggingface.co/datasets/LDJnr/Pure-Dove)
- [LDJnr/Capybara](https://huggingface.co/datasets/LDJnr/Capybara)
- [meta-math/MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)
And then a DPO finetune using:
- [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
- [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)
# Prompt Format
Phi-2 Orange uses ChatML as the prompt format, with or without the system instruction.
To prompt with a system instruction (use whatever system prompt you like):
```
<|im_start|>system
You are a helpful assistant for Python which outputs in Markdown format.<|im_end|>
<|im_start|>user
Write a function to calculate the Fibonacci sequence<|im_end|>
<|im_start|>assistant
```
You can also omit the system prompt if you wish:
```
<|im_start|>user
Why is the sky blue?<|im_end|>
<|im_start|>assistant
```