TheBloke/wizardLM-7B-GGML


	---
	license: other
	inference: false
	---

	# WizardLM: An Instruction-following LLM Using Evol-Instruct

	These files are the result of merging the [delta weights](https://huggingface.co/victor123/WizardLM) with the original LLaMA 7B model.

	The code for merging is provided in the [official WizardLM GitHub repo](https://github.com/nlpxucan/WizardLM).
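
For readers unfamiliar with delta weights, the idea can be illustrated with a toy sketch. It assumes the delta is an element-wise difference, so that merged = base + delta for every parameter. The parameter names and values are hypothetical, and plain Python lists stand in for real tensors. This is not the official merge script, which lives in the repo linked above.

```python
from typing import Dict, List

Tensor = List[float]  # flat list of floats standing in for a real weight tensor


def merge_delta(base: Dict[str, Tensor], delta: Dict[str, Tensor]) -> Dict[str, Tensor]:
    """Return merged weights, assuming merged = base + delta element-wise."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged: Dict[str, Tensor] = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Add the delta to the base weights, one element at a time.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy example with hypothetical parameter names and values.
base = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.5]}
delta = {"layer0.weight": [0.25, -0.5], "layer0.bias": [0.25]}
merged = merge_delta(base, delta)
```

Distributing only the delta lets the fine-tuned model be shared without redistributing the original weights; anyone who already has the base model can reconstruct the merged model locally.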

	## WizardLM-7B GGML

	This repo contains GGML files of WizardLM-7B for CPU inference with llama.cpp.
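
As a rough sketch of how one of these files might be run, the snippet below builds a command line for the llama.cpp `main` example program. It assumes a llama.cpp build contemporary with these GGML files, compiled as `./main` in the current directory, with the model file downloaded next to it. The helper name `llama_cpp_command`, the prompt, and the default thread and token counts are illustrative choices, not values from this repo.

```python
import shlex
from typing import List


def llama_cpp_command(
    model_path: str,
    prompt: str,
    n_predict: int = 128,
    threads: int = 4,
    binary: str = "./main",  # assumed location of the llama.cpp example binary
) -> List[str]:
    """Build an argument list for llama.cpp's `main` program."""
    return [
        binary,
        "-m", model_path,       # model file to load
        "-t", str(threads),     # number of CPU threads
        "-n", str(n_predict),   # number of tokens to generate
        "-p", prompt,           # prompt text
    ]


cmd = llama_cpp_command("WizardLM-7B.GGML.q4_2.bin", "Write a haiku about llamas.")
print(shlex.join(cmd))  # copy-pasteable shell command
```

The resulting list can be passed to `subprocess.run(cmd)` once the binary and model file are in place.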

	## Provided files
	| Name | Quant method | Bits | Size | RAM required | Use case |
	| ---- | ---- | ---- | ---- | ---- | ----- |
	| `WizardLM-7B.GGML.q4_0.bin` | q4_0 | 4-bit | 4.0 GB | 6 GB | Superseded and not recommended |
	| `WizardLM-7B.GGML.q4_2.bin` | q4_2 | 4-bit | 4.0 GB | 6 GB | Best compromise between resources, speed and quality |
	| `WizardLM-7B.GGML.q4_3.bin` | q4_3 | 4-bit | 4.8 GB | 7 GB | Maximum quality, high RAM requirements and slow inference |

	* The q4_0 file is provided for compatibility with older versions of llama.cpp. It has been superseded and is no longer recommended.
	* The q4_2 file offers the best combination of performance and quality.
	* The q4_3 file offers the highest quality, at the cost of increased RAM usage and slower inference speed.
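
The guidance above can be turned into a small selection helper. This is only a sketch: the file names and RAM figures are copied from the table, while the function name `pick_file` and its selection policy (default to q4_2, use q4_3 only when quality is preferred and RAM allows, use q4_0 only for older llama.cpp builds) are one possible reading of the recommendations.

```python
from typing import Optional

# RAM required in GB for each file, copied from the "Provided files" table.
RAM_REQUIRED_GB = {
    "WizardLM-7B.GGML.q4_0.bin": 6,
    "WizardLM-7B.GGML.q4_2.bin": 6,
    "WizardLM-7B.GGML.q4_3.bin": 7,
}


def pick_file(
    available_ram_gb: float,
    prefer_quality: bool = False,
    legacy_llama_cpp: bool = False,
) -> Optional[str]:
    """Suggest a file for the given RAM budget, or None if nothing fits."""

    def fits(name: str) -> bool:
        return available_ram_gb >= RAM_REQUIRED_GB[name]

    if legacy_llama_cpp:
        # q4_0 is superseded; use it only for older llama.cpp versions.
        name = "WizardLM-7B.GGML.q4_0.bin"
        return name if fits(name) else None
    if prefer_quality and fits("WizardLM-7B.GGML.q4_3.bin"):
        return "WizardLM-7B.GGML.q4_3.bin"
    if fits("WizardLM-7B.GGML.q4_2.bin"):
        return "WizardLM-7B.GGML.q4_2.bin"
    return None


choice = pick_file(8, prefer_quality=True)
```

With 8 GB available and `prefer_quality=True`, the helper suggests the q4_3 file; without that flag it falls back to q4_2.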

	# Original model info

	## Overview of Evol-Instruct

	Evol-Instruct is a novel method that uses LLMs instead of humans to automatically mass-produce open-domain instructions across a range of difficulty levels and skills, with the aim of improving LLM performance.

	![info](https://github.com/nlpxucan/WizardLM/raw/main/imgs/git_running.png)