Update README.md
README.md CHANGED
@@ -9,7 +9,7 @@ datasets:
 
 
 <!-- LoRA Weights can be found here: https://huggingface.co/bhenrym14/airophin-13b-pntk-16k-LoRA -->
-
+GPTQ weights can be found here: https://huggingface.co/bhenrym14/airophin-13b-pntk-16k-GPTQ
 
 ## Overview
 
@@ -27,8 +27,8 @@ All training was performed with 1x RTX 6000 Ada.
 
 This model employs [Partial NTK Rope Scaling](https://github.com/jquesnelle/scaled-rope/pull/1). This methodology is not yet implemented natively in Transformers or ExLlama (as of 7/21). There are three options to run this.
 1. Transformers (use bnb for quantization). Use the [fp16 weights](https://huggingface.co/bhenrym14/airophin-13b-pntk-16k-fp16). This will require replacing the `LlamaEmbedding` with `LlamaPartNTKScaledRotaryEmbedding`, with `max_position_embeddings=16384` and `original_max_position_embeddings=4096`. A monkeypatch can be found [here](https://github.com/bhenrym14/qlora-airoboros-longcontext/blob/main/scaledllama/llama_pntk_monkey_patch.py).
-2. Autogptq/GPTQ-for-Llama.
-3. Use ExLLama,
+2. AutoGPTQ/GPTQ-for-Llama. See the [GPTQ weights](https://huggingface.co/bhenrym14/airophin-13b-pntk-16k-GPTQ).
+3. Use ExLlama; see the [GPTQ weights](https://huggingface.co/bhenrym14/airophin-13b-pntk-16k-GPTQ).
 
 Please comment with any questions. This hasn't been extensively tested.
 
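For option 1 in the updated list, the steps (apply the monkeypatch, then load the fp16 weights with bitsandbytes quantization) can be sketched roughly as below. This is a minimal, untested sketch, not the card's verified recipe: the entry-point name `replace_llama_rope_with_pntk` is a guess, so check `llama_pntk_monkey_patch.py` for the function it actually exposes, and adjust the bnb settings to taste.

```python
# Rough sketch of option 1: fp16 weights + bitsandbytes 4-bit quantization,
# with the Partial NTK rotary-embedding monkeypatch applied first.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical entry point -- see llama_pntk_monkey_patch.py for the real one.
# The patch must run BEFORE the model is instantiated so the scaled rotary
# embedding (16384 max positions, original training length 4096) takes effect.
from llama_pntk_monkey_patch import replace_llama_rope_with_pntk
replace_llama_rope_with_pntk(
    max_position_embeddings=16384,
    original_max_position_embeddings=4096,
)

model_id = "bhenrym14/airophin-13b-pntk-16k-fp16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)

prompt = "A chat between a curious user and an assistant."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```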
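For options 2 and 3, the [GPTQ weights](https://huggingface.co/bhenrym14/airophin-13b-pntk-16k-GPTQ) are the starting point; a rough AutoGPTQ sketch follows (ExLlama consumes the same GPTQ files through its own loader). Assumptions to flag: the repo is taken to ship safetensors shards plus a quantize_config.json, and the PNTK monkeypatch is assumed to still be required on the theory that AutoGPTQ reuses the Transformers Llama modules; neither point is confirmed by the card.

```python
# Rough sketch of option 2: loading the GPTQ weights with AutoGPTQ.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Hypothetical entry point (see llama_pntk_monkey_patch.py); applied first on the
# assumption that AutoGPTQ picks up the patched Transformers rotary embedding.
from llama_pntk_monkey_patch import replace_llama_rope_with_pntk
replace_llama_rope_with_pntk(
    max_position_embeddings=16384,
    original_max_position_embeddings=4096,
)

model_id = "bhenrym14/airophin-13b-pntk-16k-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,  # assumption: quantized shards are stored as safetensors
    device="cuda:0",
)

inputs = tokenizer("A chat between a curious user and an assistant.",  # placeholder prompt
                   return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```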