Commit 3861cc6 by mustafaaljadery
1 Parent(s): 807fe15

Update README.md

README.md CHANGED
@@ -5,14 +5,14 @@ license: mit
 
 Gemma 2B with recurrent local attention with context length of up to 10M. Our implemenation uses **<32GB** of memory!
 
-![Graphic of our implementation context](./
+![Graphic of our implementation context](./graphic.png)
 
 **Features:**
 
 - 10M sequence length on Gemma 2B.
-- Runs on less
-- Native inference
--
+- Runs on less than 32GB of memory.
+- Native inference optimized for cuda.
+- Recurrent local attention for O(N) memory.
 
 ## Quick Start
 