
	Quantization made by Richard Erkhov.

	[Github](https://github.com/RichardErkhov)

	[Discord](https://discord.gg/pvy7H8DZMG)

	[Request more models](https://github.com/RichardErkhov/quant_request)


	Draco-8x7B - GGUF
	- Model creator: https://huggingface.co/Weyaxi/
	- Original model: https://huggingface.co/Weyaxi/Draco-8x7B/


	| Name | Quant method | Size |
	| ---- | ---- | ---- |
	| [Draco-8x7B.Q2_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q2_K.gguf) | Q2_K | 16.12GB |
	| [Draco-8x7B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.IQ3_XS.gguf) | IQ3_XS | 18.02GB |
	| [Draco-8x7B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.IQ3_S.gguf) | IQ3_S | 19.03GB |
	| [Draco-8x7B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q3_K_S.gguf) | Q3_K_S | 19.03GB |
	| [Draco-8x7B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.IQ3_M.gguf) | IQ3_M | 19.96GB |
	| [Draco-8x7B.Q3_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q3_K.gguf) | Q3_K | 21.0GB |
	| [Draco-8x7B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q3_K_M.gguf) | Q3_K_M | 21.0GB |
	| [Draco-8x7B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q3_K_L.gguf) | Q3_K_L | 22.51GB |
	| [Draco-8x7B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.IQ4_XS.gguf) | IQ4_XS | 23.63GB |
	| [Draco-8x7B.Q4_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q4_0.gguf) | Q4_0 | 24.63GB |
	| [Draco-8x7B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.IQ4_NL.gguf) | IQ4_NL | 24.91GB |
	| [Draco-8x7B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q4_K_S.gguf) | Q4_K_S | 24.91GB |
	| [Draco-8x7B.Q4_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q4_K.gguf) | Q4_K | 26.49GB |
	| [Draco-8x7B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q4_K_M.gguf) | Q4_K_M | 26.49GB |
	| [Draco-8x7B.Q4_1.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q4_1.gguf) | Q4_1 | 27.32GB |
	| [Draco-8x7B.Q5_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q5_0.gguf) | Q5_0 | 30.02GB |
	| [Draco-8x7B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q5_K_S.gguf) | Q5_K_S | 30.02GB |
	| [Draco-8x7B.Q5_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q5_K.gguf) | Q5_K | 30.95GB |
	| [Draco-8x7B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q5_K_M.gguf) | Q5_K_M | 30.95GB |
	| [Draco-8x7B.Q5_1.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q5_1.gguf) | Q5_1 | 32.71GB |
	| [Draco-8x7B.Q6_K.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/blob/main/Draco-8x7B.Q6_K.gguf) | Q6_K | 35.74GB |
	| [Draco-8x7B.Q8_0.gguf](https://huggingface.co/RichardErkhov/Weyaxi_-_Draco-8x7B-gguf/tree/main/) | Q8_0 | 46.22GB |
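
As a rough aid for choosing a file from the table above, here is a minimal Python sketch that picks the largest quant whose file size fits a memory budget. The sizes are copied from the table (a subset, for brevity). The function name `pick_quant` and the 2 GB `headroom_gb` default are illustrative assumptions, not part of this repository, and real memory use also depends on context length and runtime.

```python
from typing import Optional

# File sizes in GB, copied from the table above (subset shown for brevity).
QUANT_SIZES_GB = {
    "Q2_K": 16.12,
    "Q3_K_M": 21.0,
    "Q4_K_M": 26.49,
    "Q5_K_M": 30.95,
    "Q6_K": 35.74,
    "Q8_0": 46.22,
}


def pick_quant(memory_budget_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest listed quant whose file fits in the budget.

    headroom_gb is a hypothetical allowance for the KV cache and runtime
    overhead; adjust it for your own setup.
    """
    usable_gb = memory_budget_gb - headroom_gb
    fitting = [name for name, size in QUANT_SIZES_GB.items() if size <= usable_gb]
    if not fitting:
        return None
    return max(fitting, key=QUANT_SIZES_GB.get)


print(pick_quant(32.0))  # Q4_K_M: 26.49 GB fits in 30 GB, Q5_K_M (30.95 GB) does not
```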




	Original model description:
	---
	license: apache-2.0
	tags:
	- moe
	- openchat
	- hermes
	- dolphin
	- bagel
	model-index:
	- name: Draco-8x7B
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: AI2 Reasoning Challenge (25-Shot)
	      type: ai2_arc
	      config: ARC-Challenge
	      split: test
	      args:
	        num_few_shot: 25
	    metrics:
	    - type: acc_norm
	      value: 65.02
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/Draco-8x7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: HellaSwag (10-Shot)
	      type: hellaswag
	      split: validation
	      args:
	        num_few_shot: 10
	    metrics:
	    - type: acc_norm
	      value: 85.24
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/Draco-8x7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU (5-Shot)
	      type: cais/mmlu
	      config: all
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 64.96
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/Draco-8x7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: TruthfulQA (0-shot)
	      type: truthful_qa
	      config: multiple_choice
	      split: validation
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: mc2
	      value: 62.65
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/Draco-8x7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: Winogrande (5-shot)
	      type: winogrande
	      config: winogrande_xl
	      split: validation
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 80.66
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/Draco-8x7B
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GSM8k (5-shot)
	      type: gsm8k
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 66.79
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PulsarAI/Draco-8x7B
	      name: Open LLM Leaderboard
	---
	![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/VWIJplnya5L7wmGxK4Lut.jpeg)

	# 💫 Draco-8x7B

	This is the model for Draco-8x7B. I used [this repo](https://bit.ly/weyaxi-moe-repo) to make this MoE model.

	This model's experts are not using any merged models.

	# 📚 Other branches (Number of Experts Per Token)

	Other branches that this repository contains differ only slightly (from a git diff perspective) in terms of the number of experts per token.

	Usually, a higher value for the number of experts per token will result in better performance, but it may also lead to increased inference time.

	| Number of experts per token | Link of the branch |
	| --------------------------- | ------------------ |
	| 2 | [Main](https://huggingface.co/Weyaxi/Draco-8x7B/tree/main) |
	| 3 | [3-experts-per-token](https://huggingface.co/Weyaxi/Draco-8x7B/tree/3-experts-per-token) |
	| 4 | [4-experts-per-token](https://huggingface.co/Weyaxi/Draco-8x7B/tree/4-experts-per-token) |
	| 6 | [6-experts-per-token](https://huggingface.co/Weyaxi/Draco-8x7B/tree/6-experts-per-token) |
	| 8 | [8-experts-per-token](https://huggingface.co/Weyaxi/Draco-8x7B/tree/8-experts-per-token) |
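
To illustrate why a larger number of experts per token costs more inference time, here is a minimal pure-Python sketch of generic top-k routing: a router scores all experts, the k highest-scoring ones are kept, and their scores are renormalised with a softmax. It is an illustration under assumptions, not this model's actual code. The function name `route_top_k` and the example scores are made up.

```python
import math


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalise their weights.

    Returns a list of (expert_index, weight) pairs. Each selected expert
    would run its own feed-forward pass, so compute grows with k.
    """
    ranked = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


# Made-up router scores for 8 experts, for illustration only.
example_logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]

for k in (2, 4, 8):
    picks = route_top_k(example_logits, k)
    print(k, [index for index, _ in picks])
```

With `k=2` only two expert feed-forward passes run per token, while `k=8` runs all eight.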


	# 💬 Prompt Template(s):

	This model combines several expert models that use different prompt formats, so a single prompt template is not enough. Try the templates below and decide which works best for you.

	Note: The current chat template in the tokenizer config is set to [openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)'s chat template.

	Note 2: [jondurbin/bagel-dpo-7b-v0.1](https://huggingface.co/jondurbin/bagel-dpo-7b-v0.1) uses many prompt templates besides the ones listed here. Visit [jondurbin/bagel-dpo-7b-v0.1](https://huggingface.co/jondurbin/bagel-dpo-7b-v0.1) to learn more about these templates.

	### GPT4 Correct

	Used in [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106), [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B)

	```
	GPT4 Correct User: {user}<|end_of_turn|>GPT4 Correct Assistant: {assistant}<|end_of_turn|>
	```
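
A minimal Python sketch of building a GPT4 Correct prompt from a list of turns, following the template above. The helper name `format_gpt4_correct` is hypothetical. Ending the prompt with a bare `GPT4 Correct Assistant:` so that the model writes the next reply is an assumption about how the template is used at inference time.

```python
END_OF_TURN = "<|end_of_turn|>"
ROLE_LABELS = {"user": "GPT4 Correct User", "assistant": "GPT4 Correct Assistant"}


def format_gpt4_correct(turns, add_generation_prompt=True):
    """Render (role, text) pairs in the GPT4 Correct format.

    role must be "user" or "assistant". When add_generation_prompt is True,
    an open assistant label is appended for the model to complete.
    """
    prompt = "".join(f"{ROLE_LABELS[role]}: {text}{END_OF_TURN}" for role, text in turns)
    if add_generation_prompt:
        prompt += f"{ROLE_LABELS['assistant']}:"
    return prompt


print(format_gpt4_correct([("user", "Hello")]))
```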

	### ChatML:

	Used in [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B), [jondurbin/bagel-dpo-7b-v0.1](https://huggingface.co/jondurbin/bagel-dpo-7b-v0.1), [cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser), [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)

	```
	<|im_start|>system
	{system}<|im_end|>
	<|im_start|>user
	{user}<|im_end|>
	<|im_start|>assistant
	{assistant}<|im_end|>
	```
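
The ChatML layout above can be filled in the same way. This sketch leaves the assistant turn open so the model generates the reply. The helper name `format_chatml` is hypothetical, and stopping on `<|im_end|>` is left to whatever runtime you use.

```python
def format_chatml(system, user):
    """Render one system message and one user message in ChatML.

    The prompt ends with an open assistant turn for the model to complete.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(format_chatml("You are a helpful assistant.", "What is a mixture of experts?"))
```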

	### Math Alpaca

	Used in [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B)

	```
	Below is an instruction that describes a task. Write a response that appropriately completes the request.

	### Instruction:
	{instruction}

	### Response: Let's think step by step.
	```
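
For completeness, a short sketch that fills the Math Alpaca template above with an instruction. The constant and function names are hypothetical; the template text is copied from the block above.

```python
MATH_ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response: Let's think step by step."
)


def format_math_alpaca(instruction):
    """Insert the instruction into the Math Alpaca template."""
    return MATH_ALPACA_TEMPLATE.format(instruction=instruction)


print(format_math_alpaca("What is 12 * 7?"))
```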

	# 🛠️ Yaml Config

	<details><summary>See config</summary>

	```yaml
	base_model: openchat/openchat-3.5-0106
	gate_mode: hidden
	dtype: bfloat16

	experts:
	  - source_model: openchat/openchat-3.5-0106
	    positive_prompts: # General (Mistral finetune)
	    - "chat"
	    - "assistant"
	    - "tell me"
	    - "explain"

	  - source_model: teknium/OpenHermes-2.5-Mistral-7B
	    positive_prompts: # General (Mistral finetune)
	    - "interact"
	    - "converse"
	    - "respond"
	    - "express"

	  - source_model: jondurbin/bagel-dpo-7b-v0.1
	    positive_prompts: # Science (Mistral finetune)
	    - "science"
	    - "biology"
	    - "chemistry"
	    - "physics"
	    - "Newton's laws"
	    - "scientific method"
	    - "periodic table"
	    - "photosynthesis process"

	  - source_model: meta-math/MetaMath-Mistral-7B
	    positive_prompts: # Math (Mistral finetune)
	    - "reason"
	    - "math"
	    - "mathematics"
	    - "solve"
	    - "count"

	  - source_model: cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
	    positive_prompts: # Uncensored (Mistral finetune)
	    - "dolphin"
	    - "uncensored"
	    - "unbiased"
	    - "unfiltered"
	    - "unrestricted"
	    - "offensive"

	  - source_model: beowolx/CodeNinja-1.0-OpenChat-7B
	    positive_prompts: # Code (openchat-3.5-1210 finetune)
	    - "code"
	    - "script"
	    - "python"
	    - "javascript"
	    - "programming"
	    - "algorithm"

	  - source_model: senseable/WestLake-7B-v2
	    positive_prompts: # Roleplay (Unknown finetune)
	    - "storywriting"
	    - "write"
	    - "scene"
	    - "story"
	    - "character"
	    - "act as"
	    - "you are"

	  - source_model: snorkelai/Snorkel-Mistral-PairRM-DPO
	    positive_prompts: # Question Answering (? Mistral-7B-Instruct-v0.2 finetune ?)
	    - "what happens"
	    - "what is"
	    - "what can"
	    - "why"
	    - "who"
	    - "can a"
	```

	</details><br>

	# 🔄 Quantized versions

	Quantized versions of this model are available thanks to [TheBloke](https://hf.co/TheBloke).

	##### GPTQ

	- [TheBloke/Draco-8x7B-GPTQ](https://huggingface.co/TheBloke/Draco-8x7B-GPTQ)

	##### GGUF

	- [TheBloke/Draco-8x7B-GGUF](https://huggingface.co/TheBloke/Draco-8x7B-GGUF)

	##### AWQ

	- [TheBloke/Draco-8x7B-AWQ](https://huggingface.co/TheBloke/Draco-8x7B-AWQ)

	If you would like to support me:

	[☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_PulsarAI__Draco-8x7B)

	| Metric | Value |
	|---------------------------------|----:|
	| Avg. | 70.89 |
	| AI2 Reasoning Challenge (25-Shot) | 65.02 |
	| HellaSwag (10-Shot) | 85.24 |
	| MMLU (5-Shot) | 64.96 |
	| TruthfulQA (0-shot) | 62.65 |
	| Winogrande (5-shot) | 80.66 |
	| GSM8k (5-shot) | 66.79 |
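
The `Avg.` row is consistent with an unweighted mean of the six benchmark scores, which this short Python check reproduces. Treating it as a plain mean is an assumption that the arithmetic bears out; the variable names are illustrative.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 65.02,
    "HellaSwag (10-Shot)": 85.24,
    "MMLU (5-Shot)": 64.96,
    "TruthfulQA (0-shot)": 62.65,
    "Winogrande (5-shot)": 80.66,
    "GSM8k (5-shot)": 66.79,
}

# Unweighted mean, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 70.89
```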