neuralphi-2 / README.md
metadata
license: apache-2.0
datasets:
  - Intel/orca_dpo_pairs

Model Summary

Neuralphi-2 is an experiment in fine-tuning with DPO (Direct Preference Optimization). It follows Maxime Labonne's excellent article about fine-tuning Mistral-7B. The model is phi-2-sft fine-tuned with DPO on the Intel/orca_dpo_pairs dataset.

Prompt Format

"""### Human: {instruction}

### Assistant:"""
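
The template above can be filled programmatically. The following is a minimal sketch, not code from the model's training setup: the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical, and stripping surrounding whitespace from the instruction is an assumption made for tidiness.

```python
# Prompt template copied from the "Prompt Format" section:
# a "### Human:" line, a blank line, then "### Assistant:".
PROMPT_TEMPLATE = "### Human: {instruction}\n\n### Assistant:"


def build_prompt(instruction: str) -> str:
    """Fill the template with one instruction (hypothetical helper)."""
    # Assumption: leading/trailing whitespace in the instruction is not meaningful.
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


prompt = build_prompt("Explain DPO in one sentence.")
print(prompt)
```

The resulting string ends with `### Assistant:`, so the model's generated text would be read as the assistant's reply. Passing the string to a tokenizer and model is left out here, since the loading code depends on the setup in use.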