bhenrym14 committed on
Commit
619a609
1 Parent(s): a6f9dde

Create README.md

Mostly untested!

# RoPE Scaled QLoRA Long Context Extension of Llama-33b (LoRA)

## Overview

This is base Llama-33b with minimal additional training to extend the useful context window.
- Context length extended to 16384 via RoPE-scaled embeddings (Position Interpolation).
- Pretrained for an additional 100 steps on 8192-token sequences from the Pile dataset.
- The merged model is used as the starting point for training [bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA](https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA)

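The core of Position Interpolation is simple: instead of extrapolating RoPE to positions beyond the pretraining window, position indices are rescaled so the extended context maps back into the range the base model saw during pretraining. The sketch below is not the training code used for this model, just a minimal illustration of the idea; the function name `rope_frequencies` and its parameters are hypothetical.

```python
import torch

def rope_frequencies(dim: int, max_positions: int,
                     base: float = 10000.0, scale: float = 1.0) -> torch.Tensor:
    """Rotary embedding angles with Position Interpolation.

    Positions are divided by `scale`, so a context of
    `max_positions` tokens is squeezed back into the positional
    range the base model was pretrained on.
    """
    # Standard RoPE inverse frequencies, one per pair of dims.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Interpolated (fractional) position indices.
    positions = torch.arange(max_positions).float() / scale
    return torch.outer(positions, inv_freq)  # shape: (max_positions, dim // 2)

# Llama was pretrained with a 2048-token context; extending to 16384
# gives an interpolation factor of 16384 / 2048 = 8.
angles = rope_frequencies(dim=128, max_positions=16384, scale=16384 / 2048)
```

With `scale = 8`, position 8 of the extended model produces the same rotary angles as position 1 of the unscaled base model, which is why only a brief fine-tune is needed to adapt to the compressed positions.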
**This is a QLoRA fine-tune.**

Pretraining took 10 hours on 1x RTX 6000 Ada.