Text Generation · Transformers · PyTorch · English · llama · causal-lm · Inference Endpoints · text-generation-inference
TheBloke committed on
Commit b8ce326
Parent: 8f3a676

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -13,15 +13,15 @@ inference: true
 ---
 # StableVicuna-13B
 
-This is an HF format unquantised model of [CarterAI's StableVicuna 13B](https://huggingface.co/CarperAI/stable-vicuna-13b-delta).
+This is an HF format unquantised float16 model of [CarperAI's StableVicuna 13B](https://huggingface.co/CarperAI/stable-vicuna-13b-delta).
 
 It is the result of merging the deltas from the above repository with the original Llama 13B weights.
 
 ## Repositories available
 
 * [4bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/stable-vicuna-13B-GPTQ).
-* [4bit and 5bit GGML models for CPU inference](https://huggingface.co/TheBloke/stable-vicuna-13B-GGML).
-* [Unquantised 16bit model in HF format](https://huggingface.co/TheBloke/stable-vicuna-13B-HF).
+* [4-bit, 5-bit and 8-bit GGML models for CPU (+CUDA) inference](https://huggingface.co/TheBloke/stable-vicuna-13B-GGML).
+* [Unquantised float16 model in HF format](https://huggingface.co/TheBloke/stable-vicuna-13B-HF).
 
 ## PROMPT TEMPLATE
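The README's statement that the model "is the result of merging the deltas from the above repository with the original Llama 13B weights" refers to the Vicuna-style delta-weight workflow. Below is a minimal sketch of that merge, assuming the delta tensors were published as (finetuned − base) and that all paths here are illustrative placeholders; it is not the exact script used to produce this repo.

```python
# Sketch of a Vicuna-style delta merge (illustrative, not the repo's script).
# Assumes delta tensors are (finetuned - base) and that key names and shapes
# line up; real apply-delta scripts also resize the embedding matrix when
# the fine-tune added tokens.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "path/to/llama-13b", torch_dtype=torch.float16)    # original Llama 13B weights
delta = AutoModelForCausalLM.from_pretrained(
    "CarperAI/stable-vicuna-13b-delta", torch_dtype=torch.float16)

# Add each delta tensor onto the matching base tensor. state_dict() returns
# references to the live parameters, so the additions update base in place.
base_state = base.state_dict()
for name, tensor in delta.state_dict().items():
    base_state[name] += tensor

base.save_pretrained("stable-vicuna-13B-HF")           # merged float16 model
```

In practice the delta repository documents its own apply-delta procedure, which also handles tokenizer and embedding-size differences; the loop above only shows the core idea.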