---
license: mit
---

# Phi-2 Orange

A two-step finetune of Phi-2.

First using a collection of broad training data:

- [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
- [migtissera/Synthia-v1.3](https://huggingface.co/datasets/migtissera/Synthia-v1.3)
- [LDJnr/Verified-Camel](https://huggingface.co/datasets/LDJnr/Verified-Camel)
- [LDJnr/Pure-Dove](https://huggingface.co/datasets/LDJnr/Pure-Dove)
- [LDJnr/Capybara](https://huggingface.co/datasets/LDJnr/Capybara)
- [meta-math/MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)

And then a DPO finetune using:

- [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
- [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)

# Initial Evals

- ARC: 62.29
- TruthfulQA: 49.85
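
# DPO objective (illustrative)

For readers unfamiliar with the DPO step mentioned in the finetune description, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not the training code used for this model. The function name `dpo_loss`, the argument names, and the value `beta=0.1` are assumptions chosen for illustration; the actual hyperparameters are not stated in this card.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an illustrative assumption, not a value from this card.
    """
    # How much more the policy prefers the chosen response than the reference does.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)

    # loss = -log(sigmoid(logits)), computed in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities, for illustration only.
loss = dpo_loss(policy_chosen_logp=-5.0, policy_rejected_logp=-9.0,
                ref_chosen_logp=-7.0, ref_rejected_logp=-7.0)
print(f"DPO loss: {loss:.4f}")
```

When the policy and the reference agree on both responses, the loss equals `log(2)`; it drops below that as the policy favors the chosen response more strongly than the reference does.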