bhenrym14 committed
Commit 24ebae7
1 Parent(s): c85cb5e

Update README.md

Files changed (1):
  1. README.md +4 -2
README.md CHANGED
@@ -3,7 +3,7 @@ datasets:
  - jondurbin/airoboros-gpt4-1.4.1
  ---
 
- Mostly untested!
+ **UPDATE 8/14: I have changed the `config.json` to include the appropriate RoPE scaling specification. This model should now work with newer versions of `Transformers` without applying any patches.**
 
  Find GPTQ quantized weights here: https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-GPTQ
 
@@ -24,7 +24,9 @@ Pretraining took 10 hours. Finetuning took ~41 hours on 1x RTX 6000 Ada.
 
  The easiest way is to use the GPTQ weights (linked above) with [oobabooga text-generation-webui](https://github.com/oobabooga/text-generation-webui) and ExLlama. You'll need to set max_seq_len to 16384 and compress_pos_emb to 8. Otherwise use the transformers module.
 
- **IMPORTANT: To use these weights you'll need to patch in the appropriate RoPE scaling module. See: [replace_llama_rope_with_scaled_rope](https://github.com/bhenrym14/qlora-airoboros-longcontext/blob/main/scaledllama/llama_rope_scaled_monkey_patch-16k.py)**
+ **UPDATE 8/14: I have changed the `config.json` to include the appropriate RoPE scaling specification. This model should now work with newer versions of `Transformers` without applying any patches.**
+
+ **If using an old version of Transformers, you will need to patch in the appropriate RoPE scaling module. See: [replace_llama_rope_with_scaled_rope](https://github.com/bhenrym14/qlora-airoboros-longcontext/blob/main/scaledllama/llama_rope_scaled_monkey_patch-16k.py)**
 
  ## Motivation
  Recent advancements in extending context by RoPE scaling ([kaiokendev](https://kaiokendev.github.io/til#extending-context-to-8k) and [Meta AI](https://arxiv.org/abs/2306.15595)) demonstrate the ability to extend the context window without (total) retraining. My prior experiments have found the following:
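The update above refers to the `rope_scaling` entry in `config.json`. As a quick sanity check, that entry can be read back with `transformers`; a minimal sketch follows. The `linear`/`factor: 8.0` values in the comments are an assumption based on interpolating LLaMA's 2048-token window to 16384 (the same 8x as `compress_pos_emb`), not something stated in this commit, so verify against the actual file. The GPTQ repo id is used only because it is the one linked in the README.

```python
from transformers import AutoConfig

# GPTQ sibling repo linked in the README; substitute this repository's own id
# to inspect the full-precision weights instead.
model_id = "bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-GPTQ"

config = AutoConfig.from_pretrained(model_id)
# Assumed values: linear interpolation with factor 8 (2048 * 8 = 16384).
print(config.rope_scaling)             # e.g. {"type": "linear", "factor": 8.0}
print(config.max_position_embeddings)  # e.g. 16384
```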
 
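For the "otherwise use the transformers module" route, here is a minimal sketch of loading the weights with a recent `transformers` release (>= 4.31, which reads `rope_scaling` from `config.json`, so no patch is needed). The model id and the prompt are placeholders, not taken from this README; substitute this repository's id and the airoboros prompt format from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id -- substitute this repository's model id.
model_id = "bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# With transformers >= 4.31 the rope_scaling entry in config.json is applied
# automatically, so the 16k context works without a RoPE monkey patch.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires accelerate
)

prompt = "USER: Give a one-sentence summary of RoPE scaling. ASSISTANT:"  # placeholder prompt format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```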
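If you are on an older `transformers` that predates the `rope_scaling` config field, the patch linked above can be applied before loading. This is a sketch under two assumptions not stated in the README: the linked `llama_rope_scaled_monkey_patch-16k.py` has been downloaded and renamed to an importable (hyphen-free) module name, and `replace_llama_rope_with_scaled_rope` takes no arguments; check the file for the actual signature.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the linked llama_rope_scaled_monkey_patch-16k.py sits next to this
# script, renamed to llama_rope_scaled_monkey_patch_16k.py (Python module
# names cannot contain hyphens).
from llama_rope_scaled_monkey_patch_16k import replace_llama_rope_with_scaled_rope

# Patch LLaMA's rotary embeddings *before* the model is instantiated.
replace_llama_rope_with_scaled_rope()

model_id = "bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
```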