TheBloke committed
Commit 9f2a0a4
1 Parent(s): be76671

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -13,8 +13,8 @@ It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com
 
 * [4bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/gpt4-x-vicuna-13B-GPTQ).
 * [4bit and 5bit GGML models for CPU inference](https://huggingface.co/TheBloke/gpt4-x-vicuna-13B-GGML).
-* [float16 models in HF format for GPU inference](https://huggingface.co/TheBloke/gpt4-x-vicuna-13B-HF).
-
+* [float16 HF model for unquantised and 8bit GPU inference](https://huggingface.co/TheBloke/gpt4-x-vicuna-13B-HF).
+
 ## Provided files
 | Name | Quant method | Bits | Size | RAM required | Use case |
 | ---- | ---- | ---- | ---- | ---- | ----- |
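
For context on the amended bullet: it advertises the float16 HF repo for both unquantised and 8bit GPU inference. Below is a minimal sketch of what that looks like with transformers plus bitsandbytes; it is illustrative and not part of this commit. Only the model ID comes from the diff; the prompt and generation settings are assumptions.

```python
# Illustrative sketch (not from this commit): loading the float16 HF repo
# for 8bit GPU inference via transformers + bitsandbytes.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/gpt4-x-vicuna-13B-HF"  # repo linked in the new bullet

tokenizer = AutoTokenizer.from_pretrained(model_id)

# load_in_8bit=True quantises the float16 weights to int8 at load time via
# bitsandbytes, roughly halving VRAM versus unquantised float16; for
# unquantised inference, replace it with torch_dtype=torch.float16.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,
    device_map="auto",
)

prompt = "Hello, how are you?"  # assumed prompt, purely for demonstration
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```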