---
pipeline_tag: conversational
tags:
- vicuna
- llama
- text-generation-inference
---
Converted for use with [llama.cpp](https://github.com/ggerganov/llama.cpp)

---
- Based on AlekseyKorshuk/vicuna-7b
- 4-bit quantized
- Needs ~6 GB of CPU RAM
- Won't work with alpaca.cpp or older llama.cpp builds (the new ggml format requires the latest llama.cpp; see the usage sketch below)
- 7B parameter version

---
A bigger 13B version can be found here: https://huggingface.co/eachadea/ggml-vicuna-13b-4bit
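---
As a minimal sketch of how to run this with a recent llama.cpp checkout: the weights filename below (`ggml-vicuna-7b-4bit.bin`) is a placeholder, so substitute whichever `.bin` file you downloaded from this repo, and the `### Human:` / `### Assistant:` turns assume Vicuna's usual prompt format.

```sh
# Build the latest llama.cpp (older builds can't read the new ggml format)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Interactive chat; -r stops generation at the next "### Human:" turn.
# Filename is an assumption -- use the actual file from this repo.
./main -m ./models/ggml-vicuna-7b-4bit.bin \
  -n 256 --repeat_penalty 1.1 --color -i \
  -r "### Human:" \
  -p "### Human: Hello, who are you?
### Assistant:"
```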