TheBloke/vicuna-7B-1.1-GPTQ


TheBloke committed on Apr 13, 2023

Commit 21843a4 • 1 Parent(s): f52c97f

Update README.md

Files changed (1)

README.md +15 -0


@@ -10,6 +10,21 @@ It was created by merging the deltas provided in the above repo with the origina
 It was then quantized to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
+## My Vicuna 1.1 model repositories
+I have the following Vicuna 1.1 repositories available:
+**13B models:**
+* [Unquantized 13B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-13B-1.1-HF)
+* [GPTQ quantized 4bit 13B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 13B 1.1 for CPU - GGML format for `llama.cpp`](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g-GGML)
+**7B models:**
+* [Unquantized 7B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-7B-1.1-HF)
+* [GPTQ quantized 4bit 7B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g)
+* [GPTQ quantized 4bit 7B 1.1 for CPU - GGML format for `llama.cpp`](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g-GGML)
 ## Provided files
 Two model files are provided. Ideally use the `safetensors` file. Full details below: