xz56
/

neuralphi-2

Text Generation

Inference Endpoints

Model card Files Files and versions Community

neuralphi-2 / README.md

xz56's picture

Update README.md

38bbf30 verified 6 months ago

|

raw history blame contribute delete

No virus

566 Bytes

	---
	license: apache-2.0
	datasets:
	- Intel/orca_dpo_pairs
	---
	# Model Summary
	Neuralphi-2 is an experiment in DPO finetuning. It was made following Max Labonne's excellent [article](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac) about fine-tuning mistral-7b.
	Neuralphi-2 is [phi-2-sft](https://huggingface.co/lxuechen/phi-2-sft) finetuned using DPO with [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs).
	# Prompt Format
	```
	"""### Human: {instruction}

	### Assistant:"""
	```