---
pipeline_tag: conversational
tags:
- vicuna
- llama
- text-generation-inference
---
Converted for use with llama.cpp
- Based on AlekseyKorshuk/vicuna-7b
- Uncensored
- 4-bit quantized
- Needs ~6GB of CPU RAM
- Won't work with alpaca.cpp or older llama.cpp builds (the new ggml format requires an up-to-date llama.cpp; see the example invocation below)
- 7B parameter version
- EOS token fix applied in revision 1
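
For reference, here is a minimal sketch of running this model interactively with llama.cpp's `main` binary. The model filename and the Vicuna-style `### Human:`/`### Assistant:` prompt format are assumptions; adjust both to match the file you downloaded and the prompt template you prefer.

```sh
# Minimal sketch: run from a llama.cpp checkout (after building with `make`).
# The .bin filename below is an assumption — point -m at your downloaded file.
./main -m ./models/ggml-vicuna-7b-4bit-rev1.bin \
  -t 8 -n 256 --color -i \
  -r "### Human:" \
  -p "### Human: Hello, who are you?
### Assistant:"
```

The `-r "### Human:"` reverse prompt returns control to you whenever the model begins generating a new user turn, which keeps the interactive chat loop usable.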
A bigger 13B version can be found at https://huggingface.co/eachadea/ggml-vicuna-13b-4bit.
Unlike the 7B model, the 13B version is censored.