kingbri/airo-llongma-2-13B-16k-GPTQ


kingbri committed on Jul 30, 2023

Commit b26e96f • 1 Parent(s): 4fc86c3

Update README.md

Files changed (1)

README.md +3 -1

@@ -7,4 +7,6 @@ This is a GPTQ quantized version of [airo-llongma-2-13B-16k](https://huggingface
 To run this model, make sure `compress_pos_emb` is set to 4 to apply proper rope scaling parameters. The `max_ctx_len` is 16384.
-The main branch of this repository is a 4bit 128g model with act order set to false. Sequence length was 4096 when quantizing.

 To run this model, make sure `compress_pos_emb` is set to 4 to apply proper rope scaling parameters. The `max_ctx_len` is 16384.
+Branches:
+- main: 4 bits, groupsize 128, act order false
+- 4bit-32g-actorder: 4 bits, groupsize 32, act order true
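
The relationship between `compress_pos_emb` and `max_ctx_len` stated in the README can be checked with a little arithmetic. The sketch below is illustrative only: `linear_rope_scale` and `BRANCHES` are hypothetical names, not part of any loader, and the native context length of 4096 tokens is an assumption about the Llama 2 base model rather than something this commit states.

```python
# Assumption: the base model's native context window is 4096 tokens.
NATIVE_CTX_LEN = 4096


def linear_rope_scale(target_ctx_len: int, native_ctx_len: int = NATIVE_CTX_LEN) -> int:
    """Return the linear position-compression factor for a target context length.

    Hypothetical helper: treats the factor as target length / native length.
    """
    if target_ctx_len % native_ctx_len != 0:
        raise ValueError("target context length must be a multiple of the native length")
    return target_ctx_len // native_ctx_len


# Branch settings copied from the README's "Branches" list.
BRANCHES = {
    "main": {"bits": 4, "group_size": 128, "act_order": False},
    "4bit-32g-actorder": {"bits": 4, "group_size": 32, "act_order": True},
}

if __name__ == "__main__":
    max_ctx_len = 16384  # value given in the README
    print(linear_rope_scale(max_ctx_len))  # prints 4, matching compress_pos_emb
    for name, cfg in BRANCHES.items():
        print(name, cfg)
```

Under that assumption, 16384 / 4096 = 4, which agrees with the `compress_pos_emb` value the README asks for. How the factor and the branch name are actually passed in depends on the loader being used.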