This is [Jon Durbin's Airoboros 33B GPT4 1.4](https://huggingface.co/jondurbin/a
- Training sequences beyond 2048 have the target truncated to equal 2048.
- Used airoboros-gpt4-1.4.1 dataset instead of airoboros-gpt4-1.4
Otherwise, I emulated the training process as closely as possible (rank 64 QLoRA). It was trained on 1x RTX 6000 Ada for ~43 hours.
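For readers unfamiliar with the setup, a rank-64 QLoRA run is typically configured with `peft` and `bitsandbytes` roughly as below. This is an illustrative sketch only: the target modules, alpha, and dropout values here are assumptions, not the exact hyperparameters used for this model.

```python
# Hypothetical QLoRA configuration sketch; only the rank (r=64) is taken
# from this README, everything else is an assumed/typical value.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",              # NF4 quantization
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for the frozen base
)

lora_config = LoraConfig(
    r=64,                                   # rank 64, as stated above
    lora_alpha=16,                          # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    lora_dropout=0.05,                      # assumed dropout
    task_type="CAUSAL_LM",
)
```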
## NTK Patch
To use with HF transformers, AutoGPTQ, etc., see the [NTK monkey patch](https://github.com/bhenrym14/qlora-airoboros-longcontext/blob/main/scaledllama/llama_rope_ntk_monkey_patch.py).
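The patch replaces LLaMA's rotary position embedding with an NTK-scaled variant. As a rough illustration (assuming the standard NTK-aware formulation, not code taken from the patch itself), the core trick is to enlarge the RoPE frequency base by a factor that depends on the head dimension, so low-frequency dimensions get interpolated while high-frequency ones stay nearly unchanged:

```python
def ntk_scaled_base(base: float = 10000.0, alpha: float = 2.0, dim: int = 128) -> float:
    """Return the enlarged RoPE base for NTK-aware context extension.

    Standard NTK-aware scaling: base' = base * alpha ** (dim / (dim - 2)),
    where `alpha` is the context-length scaling factor and `dim` is the
    rotary (head) dimension. With alpha=1 the base is unchanged.
    """
    return base * alpha ** (dim / (dim - 2))
```

A patch of this kind must be applied *before* the model is instantiated, so the attention layers pick up the scaled rotary embedding when they are built.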