KnutJaegersberg committed
Commit a9bfb02 · Parent(s): 41e092b
Update README.md

README.md CHANGED
@@ -7,6 +7,8 @@ pipeline_tag: text-generation
 7 |
 8 | This is a model collection of mostly larger LLMs quantized to 2 bit with the novel quip# inspired approach in llama.cpp
 9 | Sometimes both xs and xxs are available.
10 | + Note that for some larger models, like Qwen-72b based models, the context length might be too large for most GPUs, so you have to reduce it yourself in textgen-webui via the n_ctx setting.
11 | + Rope scaling for scaled models like longalpaca or yarn should be 8, set compress_pos_emb accordingly.
12 |
13 | ### Overview
14 | - Senku-70b
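The added lines set two runtime knobs: a reduced context length (n_ctx) and a rope scaling factor of 8 (compress_pos_emb in textgen-webui). A minimal sketch of how those settings translate, assuming the llama-cpp-python backend; the model filename below is a placeholder, not an actual file from this collection. To my understanding, textgen-webui's compress_pos_emb is a linear RoPE compression factor, and llama.cpp takes its reciprocal as rope_freq_scale:

```python
# Assumption: compress_pos_emb (linear RoPE scaling) maps to llama.cpp's
# rope_freq_scale as its reciprocal.
compress_pos_emb = 8
rope_freq_scale = 1.0 / compress_pos_emb  # 0.125 for compress_pos_emb = 8

# Loading would then look roughly like this (requires
# `pip install llama-cpp-python` and a real local GGUF file, so it is
# left commented out here; the path is hypothetical):
# from llama_cpp import Llama
# llm = Llama(
#     model_path="model.IQ2_XS.gguf",       # placeholder filename
#     n_ctx=4096,                           # reduced context to fit GPU memory
#     rope_freq_scale=rope_freq_scale,      # equivalent of compress_pos_emb=8
# )
print(rope_freq_scale)
```

For unscaled models in the collection, compress_pos_emb stays at 1 and only n_ctx needs lowering.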