|
---
license: apache-2.0
tags:
- merge
- mergekit
- mistral
- 7b
- lazymergekit
- mistralai/Mistral-7B-Instruct-v0.2
- mlabonne/NeuralHermes-2.5-Mistral-7B
---
|
|
|
# NeuralHermes-2.5-Mistral-7B-Mistral-7B-Instruct-v0.2-slerp |
|
|
|
NeuralHermes-2.5-Mistral-7B-Mistral-7B-Instruct-v0.2-slerp is a SLERP merge, produced with mergekit, of the following models:
|
* [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) |
|
* [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B) |
|
|
|
## Eval |
|
|
|
```
|      Groups      |Version|Filter|n-shot|  Metric   | Value |   |Stderr|
|------------------|-------|------|-----:|-----------|------:|---|-----:|
|ai2_arc           |N/A    |none  |     0|acc        | 0.7508|±  |0.0419|
|                  |       |none  |     0|acc_norm   | 0.7393|±  |0.0354|
|mmlu              |N/A    |none  |     0|acc        | 0.6082|±  |0.1381|
| - humanities     |N/A    |none  |     0|acc        | 0.5545|±  |0.1585|
| - other          |N/A    |none  |     0|acc        | 0.6823|±  |0.1122|
| - social_sciences|N/A    |none  |     0|acc        | 0.7062|±  |0.0825|
| - stem           |N/A    |none  |     0|acc        | 0.5195|±  |0.1231|
|truthfulqa        |N/A    |none  |     0|acc        | 0.5058|±  |0.0023|
|                  |       |none  |     0|bleu_max   |25.2659|±  |0.7944|
|                  |       |none  |     0|bleu_acc   | 0.5557|±  |0.0174|
|                  |       |none  |     0|bleu_diff  | 4.5134|±  |0.7505|
|                  |       |none  |     0|rouge1_max |51.5877|±  |0.8677|
|                  |       |none  |     0|rouge1_acc | 0.5496|±  |0.0174|
|                  |       |none  |     0|rouge1_diff| 6.8850|±  |1.0155|
|                  |       |none  |     0|rouge2_max |36.0848|±  |1.0385|
|                  |       |none  |     0|rouge2_acc | 0.4700|±  |0.0175|
|                  |       |none  |     0|rouge2_diff| 5.8893|±  |1.1296|
|                  |       |none  |     0|rougeL_max |48.4591|±  |0.8901|
|                  |       |none  |     0|rougeL_acc | 0.5496|±  |0.0174|
|                  |       |none  |     0|rougeL_diff| 6.5791|±  |1.0249|
```
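
These zero-shot numbers are in the output format of EleutherAI's `lm-evaluation-harness`. A hedged sketch of how they could be reproduced with the harness's Python API (assuming harness ≥ 0.4; exact task names and the API surface vary between releases):

```python
# Reproduction sketch only: assumes lm-evaluation-harness >= 0.4
# (pip install lm-eval); task names differ across harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=MaziyarPanahi/NeuralHermes-2.5-Mistral-7B-Mistral-7B-Instruct-v0.2-slerp",
    tasks=["ai2_arc", "mmlu", "truthfulqa"],
    num_fewshot=0,
)
print(results["results"])
```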
|
## 🧩 Configuration |
|
|
|
```yaml
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
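
With `merge_method: slerp`, each pair of corresponding weight tensors is interpolated along the arc between them rather than along a straight line, with the factor `t` (roughly, 0 keeps the base model's tensor and 1 the other model's) swept across the anchor values above over the 32 layers, separately for attention and MLP weights. A minimal sketch of the underlying operation, not mergekit's actual implementation:

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    a, b = v0.flatten().float(), v1.flatten().float()
    # Angle between the two tensors, from their normalized dot product
    cos_omega = torch.clamp((a / (a.norm() + eps)) @ (b / (b.norm() + eps)), -1.0, 1.0)
    omega = torch.arccos(cos_omega)
    if omega.abs() < eps:  # near-parallel tensors: fall back to plain lerp
        return (1 - t) * v0 + t * v1
    so = torch.sin(omega)
    out = (torch.sin((1 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
    return out.reshape(v0.shape).to(v0.dtype)
```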
|
|
|
|
|
## 💻 Usage |
|
|
|
|
|
```python
# Install dependencies (notebook syntax; drop the leading "!" in a shell)
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "MaziyarPanahi/NeuralHermes-2.5-Mistral-7B-Mistral-7B-Instruct-v0.2-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
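
The merge should inherit the tokenizer and chat template from the base model, mistralai/Mistral-7B-Instruct-v0.2, so `apply_chat_template` wraps the message in its `[INST] ... [/INST]` format. The sampling settings above (temperature 0.7, top-k 50, top-p 0.95) are sensible defaults rather than tuned values.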