---
license: mit
---

# Phi-2 Orange

A two-step finetune of Phi-2.

First using a collection of broad training data:

- [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
- [migtissera/Synthia-v1.3](https://huggingface.co/datasets/migtissera/Synthia-v1.3)
- [LDJnr/Verified-Camel](https://huggingface.co/datasets/LDJnr/Verified-Camel)
- [LDJnr/Pure-Dove](https://huggingface.co/datasets/LDJnr/Pure-Dove)
- [LDJnr/Capybara](https://huggingface.co/datasets/LDJnr/Capybara)
- [meta-math/MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)

And then a DPO finetune using:

- [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
- [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)

# Initial Evals

- ARC: 62.29
- TruthfulQA: 49.85