shahzebnaveed/NeuralHermes-2.5-Mistral-7B

	---
	library_name: transformers
	license: apache-2.0
	---

	# Model Card for NeuralHermes 2.5 - Mistral 7B


	NeuralHermes is teknium/OpenHermes-2.5-Mistral-7B further fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset, reformatted with the ChatML template.

	It is directly inspired by the RLHF process that the authors of Intel/neural-chat-7b-v3-1 describe for improving performance.
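To make "reformatted with the ChatML template" concrete, here is a minimal sketch of how one preference row could be turned into the `prompt` / `chosen` / `rejected` strings that DPO trains on. It is a hand-written illustration, not the training code used for this model: the helper names `chatml_turn` and `format_dpo_example` are hypothetical, the row below is made up, and the column names (`system`, `question`, `chosen`, `rejected`) are assumed to match Intel/orca_dpo_pairs.

```python
# Illustrative ChatML formatter (hypothetical helpers, not the author's code).
IM_START, IM_END = "<|im_start|>", "<|im_end|>"


def chatml_turn(role, content):
    """Render a single ChatML turn: header, content, end marker."""
    return f"{IM_START}{role}\n{content}{IM_END}\n"


def format_dpo_example(row):
    """Map one preference row to prompt/chosen/rejected strings."""
    prompt = ""
    if row.get("system"):  # the system message is optional
        prompt += chatml_turn("system", row["system"])
    prompt += chatml_turn("user", row["question"])
    prompt += f"{IM_START}assistant\n"  # left open so the answer follows it
    return {
        "prompt": prompt,
        "chosen": row["chosen"] + IM_END + "\n",
        "rejected": row["rejected"] + IM_END + "\n",
    }


# Made-up row, with column names assumed to match the dataset.
example = format_dpo_example({
    "system": "You are a helpful assistant chatbot.",
    "question": "What is a Large Language Model?",
    "chosen": "A neural network trained on large amounts of text.",
    "rejected": "I don't know.",
})
print(example["prompt"])
```

In practice the tokenizer's own `apply_chat_template` would normally build the prompt part; the string version here only shows what the template produces.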


	IMPORTANT

	- This model was trained for only 2 steps before the GPU ran out of memory, so the DPO fine-tuning is incomplete.
	- To make training fit on a small GPU, I deliberately reduced the LoRA settings (number of adapters, alpha, etc.), so the values used are not ideal.



	## Uses

	The following code loads the model and generates a response to a chat prompt:


	import transformers
	from transformers import AutoTokenizer

	# Hub repository id of this model (assumed from the page this card is published on)
	new_model = "shahzebnaveed/NeuralHermes-2.5-Mistral-7B"

	# Format the prompt with the tokenizer's chat template
	message = [
	    {"role": "system", "content": "You are a helpful assistant chatbot."},
	    {"role": "user", "content": "What is a Large Language Model?"},
	]
	tokenizer = AutoTokenizer.from_pretrained(new_model)
	prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

	# Create the text-generation pipeline
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=new_model,
	    tokenizer=tokenizer,
	)

	# Generate text (max_length counts prompt tokens plus generated tokens)
	sequences = pipeline(
	    prompt,
	    do_sample=True,
	    temperature=0.7,
	    top_p=0.9,
	    num_return_sequences=1,
	    max_length=200,
	)
	print(sequences[0]['generated_text'])
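By default the text-generation pipeline returns the prompt followed by the completion in `generated_text`. A small sketch of how the reply alone could be recovered is shown below; `strip_prompt` is a hypothetical helper, and the sample strings are made up rather than real model output. Passing `return_full_text=False` to the pipeline call should give a similar result without post-processing.

```python
def strip_prompt(generated_text, prompt):
    """Return only the completion, assuming the output starts with the prompt."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


# Made-up strings standing in for the prompt and the pipeline output.
sample_prompt = "<|im_start|>user\nWhat is a Large Language Model?<|im_end|>\n<|im_start|>assistant\n"
sample_output = sample_prompt + "A model trained on large amounts of text.\n"

reply = strip_prompt(sample_output, sample_prompt)
print(reply)
```

If the output does not begin with the prompt, the helper leaves the text unchanged apart from trimming whitespace.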