Mostly untested!
# RoPE Scaled QLoRA Long Context Extension of Llama-33b (LoRA)
## Overview
This is base Llama-33b with minimal additional training to extend the useful context window:
- Context length extended to 16384 via RoPE-scaled embeddings (Position Interpolation); see the sketch after this list.
- Pretrained for an additional 100 steps on 8192-token sequences from the Pile dataset.
- The merged model is used as the starting point for training [bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA](https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA).
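
For intuition, Position Interpolation simply compresses the position indices so that 16384 tokens map onto the 2048-position range the base model was pretrained with (a scale factor of 16384 / 2048 = 8), then builds the rotary sin/cos tables from those fractional positions. Below is a minimal sketch following the shape conventions of older `transformers` `LlamaRotaryEmbedding` implementations; the class name and constructor arguments are illustrative, not the exact code used for this run:

```python
import torch


class ScaledRotaryEmbedding(torch.nn.Module):
    """Rotary embeddings with linear Position Interpolation.

    Positions are divided by `scale` so a 16384-token context maps onto the
    2048-position range Llama was pretrained on (scale = 16384 / 2048 = 8).
    Illustrative sketch only, not this repo's training code.
    """

    def __init__(self, dim, max_position_embeddings=16384, base=10000, scale=8.0, device=None):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, device=device).float() / dim))
        self.register_buffer("inv_freq", inv_freq)
        # Interpolated (fractional) positions: 0, 1/scale, 2/scale, ...
        t = torch.arange(max_position_embeddings, device=device).float() / scale
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos()[None, None, :, :])
        self.register_buffer("sin_cached", emb.sin()[None, None, :, :])

    def forward(self, x, seq_len):
        # Return cos/sin tables truncated to the current sequence length.
        return (
            self.cos_cached[:, :, :seq_len, ...].to(x.dtype),
            self.sin_cached[:, :, :seq_len, ...].to(x.dtype),
        )
```

Swapping each attention layer's `rotary_emb` module for something like this (the kaiokendev-style monkey patch) is one way to apply the scaling; the exact attribute paths vary across `transformers` versions.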
**This is a QLoRA fine-tune**
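
In practice that means the released weights are a LoRA adapter trained on top of a 4-bit NF4-quantized base. A minimal sketch of loading such an adapter for inference, assuming standard `transformers` + `peft` + `bitsandbytes`; the base-model repo name and adapter path below are placeholders rather than this repo's exact identifiers, and the RoPE scaling patch above must still be applied for contexts beyond 2048:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# NF4 4-bit quantization config, as typically used for QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# "huggyllama/llama-30b" and the adapter path are illustrative placeholders.
base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-30b",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "path/to/this-lora-adapter")
```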
Pretraining took 10 hours on 1x RTX 6000 Ada.