---
license: other
inference: false
---

# WizardLM: An Instruction-following LLM Using Evol-Instruct

These files are the result of merging the delta weights with the original LLaMA 7B model.

The code for merging is provided in the official WizardLM GitHub repo.
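Conceptually, a delta-weight merge just adds each released delta tensor to the corresponding base-model tensor. The sketch below illustrates that idea only; the `merge_deltas` helper and tensor names are hypothetical, and the actual script in the WizardLM repo additionally handles sharded checkpoints, tokenizer files, and dtype conversion:

```python
# Illustrative sketch of delta-weight merging: merged = base + delta,
# tensor by tensor. Not the WizardLM repo's actual merge script.

def merge_deltas(base_weights, delta_weights):
    """Add delta tensors to base tensors, matched by parameter name."""
    if base_weights.keys() != delta_weights.keys():
        raise ValueError("base and delta checkpoints contain different tensors")
    return {name: base_weights[name] + delta_weights[name]
            for name in base_weights}

# Tiny worked example, with plain floats standing in for weight tensors:
base = {"layers.0.weight": 1.0, "layers.0.bias": -0.5}
delta = {"layers.0.weight": 0.25, "layers.0.bias": 0.5}
merged = merge_deltas(base, delta)
# merged["layers.0.weight"] == 1.25, merged["layers.0.bias"] == 0.0
```

In a real merge the values would be PyTorch tensors rather than floats, but the per-parameter addition is the same.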

## WizardLM-7B GGML

This repo contains GGML format files for WizardLM-7B, for CPU inference with llama.cpp.

## Provided files

| Name | Quant method | Bits | Size | RAM required | Use case |
| ---- | ------------ | ---- | ---- | ------------ | -------- |
| WizardLM-7B.GGML.q4_0.bin | q4_0 | 4-bit | 4.0 GB | 6 GB | Superseded and not recommended |
| WizardLM-7B.GGML.q4_2.bin | q4_2 | 4-bit | 4.0 GB | 6 GB | Best compromise between resources, speed and quality |
| WizardLM-7B.GGML.q4_3.bin | q4_3 | 4-bit | 4.8 GB | 7 GB | Maximum quality, high RAM requirements and slow inference |
* The q4_0 file is provided for compatibility with older versions of llama.cpp. It has been superseded and is no longer recommended.
* The q4_2 file offers the best combination of performance and quality.
* The q4_3 file offers the highest quality, at the cost of increased RAM usage and slower inference speed.
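Once downloaded, the files can be run with llama.cpp's `main` binary. A minimal invocation, assuming the recommended q4_2 file has been placed in llama.cpp's `models/` directory (the paths and prompt are illustrative, and exact flags may vary between llama.cpp versions):

```shell
# Run WizardLM-7B (q4_2) on CPU with llama.cpp.
# -m: path to the GGML model file (illustrative path)
# -n: maximum number of tokens to generate
# -p: the prompt
./main -m models/WizardLM-7B.GGML.q4_2.bin \
    -n 256 \
    -p "Write a short poem about the sea."
```

Per the table above, expect roughly 6 GB of free RAM for the q4_2 file and 7 GB for q4_3.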

## Original model info

### Overview of Evol-Instruct

Evol-Instruct is a novel method that uses LLMs instead of humans to automatically mass-produce open-domain instructions spanning a wide range of difficulty levels and skills, in order to improve the performance of LLMs.
