metadata
license: mit
datasets:
  - Open-Orca/SlimOrca-Dedup
  - migtissera/Synthia-v1.3
  - LDJnr/Verified-Camel
  - LDJnr/Pure-Dove
  - LDJnr/Capybara
  - meta-math/MetaMathQA
  - Intel/orca_dpo_pairs
  - argilla/ultrafeedback-binarized-preferences-cleaned

Phi-2 Orange

Phi-2 Orange Version 2

A two-step finetune of Phi-2, with a bit more zest.

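First, a supervised finetune on a broad collection of training data:

- Open-Orca/SlimOrca-Dedup
- migtissera/Synthia-v1.3
- LDJnr/Verified-Camel
- LDJnr/Pure-Dove
- LDJnr/Capybara
- meta-math/MetaMathQA

And then a DPO finetune using:

- Intel/orca_dpo_pairs
- argilla/ultrafeedback-binarized-preferences-cleaned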

Prompt Format

Phi-2 Orange uses ChatML as the prompt format, with or without the system instruction.

To prompt with a system instruction (use whatever system prompt you like):

<|im_start|>system
You are a helpful assistant for Python which outputs in Markdown format.<|im_end|>
<|im_start|>user
Write a function to calculate the Fibonacci sequence<|im_end|>
<|im_start|>assistant

You can also omit the system prompt if you wish:

<|im_start|>user
Why is the sky blue?<|im_end|>
<|im_start|>assistant
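
The two prompt layouts above can be assembled programmatically. The following is a minimal sketch rather than code shipped with the model: the helper name `build_chatml_prompt` is hypothetical, and the trailing newline after the final `<|im_start|>assistant` marker is an assumption based on common ChatML usage.

```python
from typing import Optional

# ChatML turn delimiters, as shown in the prompt examples above.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(user_message: str,
                        system_message: Optional[str] = None) -> str:
    """Build a single-turn ChatML prompt (hypothetical helper).

    The system turn is included only when system_message is given.
    The prompt ends with an open assistant turn for the model to complete.
    """
    parts = []
    if system_message is not None:
        parts.append(f"{IM_START}system\n{system_message}{IM_END}\n")
    parts.append(f"{IM_START}user\n{user_message}{IM_END}\n")
    # Assumption: a newline follows the assistant marker, so generation
    # starts on a fresh line.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


# With a system instruction
print(build_chatml_prompt(
    "Write a function to calculate the Fibonacci sequence",
    "You are a helpful assistant for Python which outputs in Markdown format.",
))

# Without a system instruction
print(build_chatml_prompt("Why is the sky blue?"))
```

The resulting string can be passed to whichever tokenizer and generation call you use. Stopping generation at `<|im_end|>` is the usual way to end the assistant turn with ChatML-style prompts.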