Mostly untested!
# RoPE Scaled QLoRA Long Context Extension of Llama-33b (LoRA)
## Overview
This is base Llama-33b with minimal additional training to extend the useful context window:
- Context length extended to 16384 via RoPE-scaled embeddings (Position Interpolation); see the sketch after this list.
- Pretrained for an additional 100 steps on 8192-token sequences from the Pile dataset.
- The merged model is used as the starting point for training [bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA](https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA).
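
For intuition, Position Interpolation simply compresses the position indices so that 16384 tokens map onto the 2048-position range the base model was pretrained with (a scale factor of 16384 / 2048 = 8), then builds the rotary sin/cos tables from those fractional positions. Below is a minimal sketch following the shape conventions of older `transformers` `LlamaRotaryEmbedding` implementations; the class name and constructor arguments are illustrative, not the exact code used for this run:

```python
import torch


class ScaledRotaryEmbedding(torch.nn.Module):
    """Rotary embeddings with linear Position Interpolation.

    Positions are divided by `scale` so a 16384-token context maps onto the
    2048-position range Llama was pretrained on (scale = 16384 / 2048 = 8).
    Illustrative sketch only, not this repo's training code.
    """

    def __init__(self, dim, max_position_embeddings=16384, base=10000, scale=8.0, device=None):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, device=device).float() / dim))
        self.register_buffer("inv_freq", inv_freq)
        # Interpolated (fractional) positions: 0, 1/scale, 2/scale, ...
        t = torch.arange(max_position_embeddings, device=device).float() / scale
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos()[None, None, :, :])
        self.register_buffer("sin_cached", emb.sin()[None, None, :, :])

    def forward(self, x, seq_len):
        # Return cos/sin tables truncated to the current sequence length.
        return (
            self.cos_cached[:, :, :seq_len, ...].to(x.dtype),
            self.sin_cached[:, :, :seq_len, ...].to(x.dtype),
        )
```

Swapping each attention layer's `rotary_emb` module for something like this (the kaiokendev-style monkey patch) is one way to apply the scaling; the exact attribute paths vary across `transformers` versions.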
**This is a QLoRA fine-tune**
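
In practice that means the released weights are a LoRA adapter trained on top of a 4-bit NF4-quantized base. A minimal sketch of loading such an adapter for inference, assuming standard `transformers` + `peft` + `bitsandbytes`; the base-model repo name and adapter path below are placeholders rather than this repo's exact identifiers, and the RoPE scaling patch above must still be applied for contexts beyond 2048:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# NF4 4-bit quantization config, as typically used for QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# "huggyllama/llama-30b" and the adapter path are illustrative placeholders.
base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-30b",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "path/to/this-lora-adapter")
```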
Pretraining took 10 hours on 1x RTX 6000 Ada.