TheBloke committed
Commit b670bf2
1 Parent(s): be5bf85

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
@@ -9,7 +9,7 @@ pipeline_tag: text2text-generation
 
 These files are the result of merging the [LoRA weights of chtan's gpt4-alpaca-lora_mlp-65B](https://huggingface.co/chtan/gpt4-alpaca-lora_mlp-65b) with the original Llama 65B model.
 
-It is the result of quantising to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
+It was then quantised to 4bit using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
 
 ## Repositories available
 
@@ -49,7 +49,7 @@ You will need at least 48GB VRAM to use this model, either on one GPU or multipl
 
 This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.
 
-It was created with `--act-order` to increase quantisation quality, but without groupsize so as to reduce VRAM requirements.
+It was created with `--act-order` to increase quantisation quality, but without group_size so as to reduce VRAM requirements.
 
 * `gpt4-alpaca-lora_mlp-65B-GPTQ-4bit.safetensors`
 * Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
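For context on the flags the changed lines refer to, below is a minimal sketch of the kind of GPTQ-for-LLaMa invocation described: 4-bit quantisation with `--act-order` and no group size. The model path, calibration dataset, and output filename are assumptions for illustration, not the exact command used for this commit.

```bash
# Illustrative sketch only -- assumes GPTQ-for-LLaMa's llama.py CLI.
# /path/to/merged-65B-HF is a hypothetical path to the merged fp16 model.
python llama.py /path/to/merged-65B-HF c4 \
    --wbits 4 \
    --act-order \
    --save_safetensors gpt4-alpaca-lora_mlp-65B-GPTQ-4bit.safetensors
# Note: no --groupsize flag is passed, matching "without group_size" in the
# README text; omitting it reduces VRAM requirements at some cost in
# quantisation quality compared to, e.g., --groupsize 128.
```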