TheBloke/wizardLM-7B-GGML


TheBloke committed on Apr 26, 2023

Commit 59371f8 · 1 Parent(s): 22cae5b

Update README.md

Files changed (1)

README.md +29 -0


@@ -1,3 +1,32 @@
 ---
 license: other
+inference: false
 ---
+# WizardLM: An Instruction-following LLM Using Evol-Instruct
+These files are the result of merging the [delta weights](https://huggingface.co/victor123/WizardLM) with the original LLaMA 7B model.
+The code for merging is provided in the [official WizardLM GitHub repo](https://github.com/nlpxucan/WizardLM).
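As a rough illustration of what merging delta weights involves, the sketch below adds each delta tensor to the matching base tensor. It assumes the deltas are applied by element-wise addition, and it uses plain Python lists with hypothetical names (`merge_delta`, `layer.weight`) in place of real checkpoints. It is not the official merge script, which should be used for any actual conversion.

```python
def merge_delta(base, delta):
    """Return base + delta for every named tensor (flat lists of floats here)."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same tensor names")
    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for tensor {name!r}")
        # Assumption: the delta is applied by simple element-wise addition.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Hypothetical toy tensors standing in for real model weights.
base = {"layer.weight": [1.0, 2.0, 3.0]}
delta = {"layer.weight": [0.5, -1.0, 0.25]}
merged = merge_delta(base, delta)
print(merged)
```

Checking names and lengths before adding helps catch a base model that does not match the one the deltas were computed against.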
+## WizardLM-7B GGML
+This repo contains GGML files of WizardLM-7B for CPU inference.
+## Provided files
+| Name | Quant method | Bits | Size | RAM required | Use case |
+| ---- | ---- | ---- | ---- | ---- | ----- |
+| `WizardLM-7B.GGML.q4_0.bin` | q4_0 | 4bit | 3.9GB | 4.1GB | Superseded and not recommended |
+| `WizardLM-7B.GGML.q4_2.bin` | q4_2 | 4bit | 3.9GB | 4.1GB | Best compromise between resources, speed and quality |
+| `WizardLM-7B.GGML.q4_3.bin` | q4_3 | 4bit | 4.7GB | 4.9GB | Maximum quality, high RAM requirements and slow inference |
+* The q4_0 file is provided for compatibility with older versions of llama.cpp. It has been superseded and is no longer recommended.
+* The q4_2 file offers the best combination of performance and quality.
+* The q4_3 file offers the highest quality, at the cost of increased RAM usage and slower inference speed.
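To see roughly how the quantization method drives file size, the sketch below estimates size from an effective bits-per-weight figure. The bits-per-weight values (which include per-block scale and offset overhead) and the parameter count are assumptions made for illustration, not figures from this repo, and the estimate ignores file metadata and any tensors kept at higher precision.

```python
# Assumed effective bits per weight, including per-block scale/offset overhead.
# These are assumptions about the formats, not values taken from this repo.
BITS_PER_WEIGHT = {"q4_0": 5.0, "q4_2": 5.0, "q4_3": 6.0}

# Assumed parameter count for a 7B LLaMA-style model.
N_PARAMS = 6.74e9


def estimate_size_gib(quant_method, n_params=N_PARAMS):
    """Estimate the quantized file size in GiB (2**30 bytes)."""
    total_bytes = n_params * BITS_PER_WEIGHT[quant_method] / 8
    return total_bytes / 2**30


for method in BITS_PER_WEIGHT:
    print(f"{method}: ~{estimate_size_gib(method):.1f} GiB")
```

Under these assumptions, the extra per-block offset stored by q4_3 is what makes it larger than the other two files.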
+# Original model info
+## Overview of Evol-Instruct
+Evol-Instruct is a novel method that uses LLMs instead of humans to automatically mass-produce open-domain instructions across a range of difficulty levels and skills, in order to improve the performance of LLMs.
+![info](https://github.com/nlpxucan/WizardLM/raw/main/imgs/git_running.png)