Phi-2 Orange
A two-step finetune of Phi-2, with a bit of zest.
An updated model, rhysjones/phi-2-orange-v2, scores higher on evaluations and is available if you wish to test it.
Training details
A first finetune using a collection of broad training data:
- Open-Orca/SlimOrca-Dedup
- migtissera/Synthia-v1.3
- LDJnr/Verified-Camel
- LDJnr/Pure-Dove
- LDJnr/Capybara
- meta-math/MetaMathQA
And then a DPO finetune using:
Run within Ollama
If you're using Ollama, you can download and run the model using:
ollama run rhysjones/phi-2-orange
Prompt Format
Phi-2 Orange uses ChatML as the prompt format, with or without the system instruction.
To prompt with a system instruction (use whatever system prompt you like):
<|im_start|>system
You are a helpful assistant for Python which outputs in Markdown format.<|im_end|>
<|im_start|>user
Write a function to calculate the Fibonacci sequence<|im_end|>
<|im_start|>assistant
You can also omit the system prompt if you wish:
<|im_start|>user
Why is the sky blue?<|im_end|>
<|im_start|>assistant
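As a rough illustration of the two layouts above, here is a minimal Python sketch that assembles a ChatML prompt string with or without a system instruction. The function name `build_chatml_prompt` is hypothetical, and the newline placement (including the trailing newline after `assistant`) is an assumption based on the examples above rather than a confirmed requirement of the model.

```python
from typing import Optional

# ChatML delimiters as they appear in the prompt examples above.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(user_message: str, system_prompt: Optional[str] = None) -> str:
    """Assemble a single-turn ChatML prompt (hypothetical helper).

    The system block is included only when a system prompt is given.
    The prompt ends with an open assistant turn for the model to complete.
    """
    parts = []
    if system_prompt is not None:
        parts.append(f"{IM_START}system\n{system_prompt}{IM_END}\n")
    parts.append(f"{IM_START}user\n{user_message}{IM_END}\n")
    # Assumption: a newline follows "assistant", as in the examples above.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt_with_system = build_chatml_prompt(
    "Write a function to calculate the Fibonacci sequence",
    system_prompt="You are a helpful assistant for Python which outputs in Markdown format.",
)
prompt_without_system = build_chatml_prompt("Why is the sky blue?")
```

The resulting string can be passed as the raw prompt to whichever inference backend you use; if the backend applies its own chat template, pass the plain messages instead so the template is not applied twice.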
Evaluations
Evaluations were done using mlabonne's useful Colab notebook, llm-autoeval. Also check out the alternative leaderboard at Yet_Another_LLM_Leaderboard.
Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
---|---|---|---|---|---|
phi-2-orange | 33.37 | 71.33 | 49.87 | 37.3 | 47.97 |
phi-2-dpo | 30.39 | 71.68 | 50.75 | 34.9 | 46.93 |
dolphin-2_6-phi-2 | 33.12 | 69.85 | 47.39 | 37.2 | 46.89 |
phi-2 | 27.98 | 70.8 | 44.43 | 35.21 | 44.61 |
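The Average column appears to be the unweighted mean of the four benchmark scores. The short Python check below, which assumes that interpretation, recomputes each mean from the table values and compares it with the reported figure to within rounding. The names `reported_scores` and `average_gap` are illustrative only.

```python
# Benchmark scores (AGIEval, GPT4All, TruthfulQA, Bigbench) and the
# reported Average, copied from the table above.
reported_scores = {
    "phi-2-orange": ([33.37, 71.33, 49.87, 37.3], 47.97),
    "phi-2-dpo": ([30.39, 71.68, 50.75, 34.9], 46.93),
    "dolphin-2_6-phi-2": ([33.12, 69.85, 47.39, 37.2], 46.89),
    "phi-2": ([27.98, 70.8, 44.43, 35.21], 44.61),
}


def average_gap(benchmarks, reported_average):
    """Absolute difference between the unweighted mean and the reported average."""
    mean = sum(benchmarks) / len(benchmarks)
    return abs(mean - reported_average)


# Assumption: Average is a plain mean rounded to two decimals, so the gap
# should stay below 0.01 for every row.
gaps = {model: average_gap(b, avg) for model, (b, avg) in reported_scores.items()}
for model, gap in gaps.items():
    print(f"{model}: gap {gap:.4f}")
```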