Quant of https://huggingface.co/TheBloke/vicuna-13B-1.1-HF

A quantized version already exists at https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g, but neither of the files uploaded there works with certain older versions of GPTQ-for-LLaMA (such as 0cc4m's fork, which is used with their fork of KoboldAI).

This was quantized with 0cc4m's fork of GPTQ-for-LLaMA.

```
python llama.py ./vicuna-13B-1.1-HF c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors 4bit-128g.safetensors
```
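The `--save_safetensors` flag writes the checkpoint in the safetensors format, whose layout is deliberately simple: an 8-byte little-endian unsigned length, a JSON header describing each tensor's dtype, shape, and byte offsets, then the raw tensor data. A minimal stdlib-only sketch of reading that header is below; it builds a tiny stand-in file (the tensor name and `demo.safetensors` filename are made up for illustration, not taken from the real checkpoint):

```python
import json
import struct

def read_safetensors_header(path):
    """Return the parsed JSON header of a .safetensors file."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian u64 giving the JSON header's length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))

# Build a tiny stand-in file in the same layout (not a real checkpoint).
header = {"demo.qweight": {"dtype": "I32", "shape": [2, 2], "data_offsets": [0, 16]}}
header_bytes = json.dumps(header).encode("utf-8")
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(header_bytes)))  # header length prefix
    f.write(header_bytes)                          # JSON header
    f.write(b"\x00" * 16)                          # raw tensor bytes

print(read_safetensors_header("demo.safetensors"))
```

Because the header is plain JSON, this is a quick way to check a downloaded file's tensor names and shapes without loading any weights.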