---
license: apache-2.0
language:
- en
---

# LongQLoRA: Efficient and Effective Method to Extend Context Length of LLMs

## Technical Report

Technical Report: [LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models](https://arxiv.org/abs/2311.04879)

## Introduction
LongQLoRA is a memory-efficient and effective method to extend the context length of large language models with fewer training GPUs.
**On a single 32GB V100 GPU**, LongQLoRA can extend the context length of LLaMA2 7B and 13B from 4096 to 8192, and even to 12k.
LongQLoRA achieves competitive perplexity on the PG19 and Proof-pile datasets after only 1000 finetuning steps; it outperforms LongLoRA and is very close to MPT-7B-8K.
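
As described in the technical report, the method builds on QLoRA-style 4-bit finetuning combined with RoPE position interpolation. Below is a minimal sketch of these two ingredients using Hugging Face `transformers` and `peft`; it is not the authors' released training code, and the LoRA rank and target modules are illustrative assumptions, not the exact configuration from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base weights (the "QLoRA" part).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # gated weights; access approval required
    quantization_config=bnb_config,
    # Linear position interpolation: factor 2.0 stretches RoPE 4096 -> 8192.
    rope_scaling={"type": "linear", "factor": 2.0},
    device_map="auto",
)

# Attach trainable LoRA adapters; only the adapters are updated during
# finetuning, keeping the memory footprint small.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```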

Evaluation perplexity on the PG19 validation set and the Proof-pile test set, at an evaluation context length of 8192:

| Model               | PG19     | Proof-pile |
|---------------------|----------|------------|
| LLaMA2-7B           | >1000    | >1000      |
| MPT-7B-8K           | 7.98     | 2.67       |
| LongLoRA-LoRA-7B-8K | 8.20     | 2.78       |
| LongLoRA-Full-7B-8K | 7.93     | 2.73       |
| **LongQLoRA-7B-8K** | **7.96** | **2.73**   |
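
For reference, here is a minimal sketch of how perplexity at a fixed evaluation context length can be computed, by averaging next-token loss over non-overlapping 8192-token windows and exponentiating. This is an assumption about the evaluation setup; the exact protocol in the paper (window stride, dataset preprocessing) may differ.

```python
import math
import torch


@torch.no_grad()
def windowed_perplexity(model, input_ids: torch.Tensor, window: int = 8192) -> float:
    """Perplexity of `input_ids` (shape [1, seq_len]) over non-overlapping windows."""
    total_nll, total_tokens = 0.0, 0
    for start in range(0, input_ids.size(1) - 1, window):
        # Take window+1 tokens so each position has a next-token target.
        chunk = input_ids[:, start : start + window + 1]
        logits = model(chunk[:, :-1]).logits
        nll = torch.nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)),  # [tokens, vocab]
            chunk[:, 1:].reshape(-1),             # shifted targets
            reduction="sum",
        )
        total_nll += nll.item()
        total_tokens += chunk.size(1) - 1
    return math.exp(total_nll / total_tokens)
```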