Gemma 2B with recurrent local attention and a context length of up to 10M tokens. Our implementation uses **<32GB** of memory!
**Features:**
- 10M sequence length on Gemma 2B.
- Runs on less than 32GB of memory.
- Native inference optimized for CUDA.
- Recurrent local attention for O(N) memory.
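To illustrate how recurrent local attention can keep memory O(N), here is a minimal NumPy sketch (a hypothetical toy, not this repository's actual implementation): the sequence is processed in fixed-size chunks, and past chunks are compressed into a small recurrent state, so no full N×N attention matrix is ever materialized.

```python
import numpy as np

def chunk_attention(q, k, v):
    """Standard softmax attention within one small chunk."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)              # (chunk, chunk + state)
    scores -= scores.max(axis=-1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def recurrent_local_attention(x, wq, wk, wv, chunk=4):
    """Attend chunk by chunk, carrying a compressed summary of the
    past, so peak memory scales with the chunk size, not with N**2."""
    d = x.shape[-1]
    state = np.zeros((1, d))                   # recurrent summary of the past
    outputs = []
    for i in range(0, len(x), chunk):
        xs = x[i:i + chunk]
        q, k, v = xs @ wq, xs @ wk, xs @ wv
        # Current chunk attends to itself plus the carried state.
        k_ext = np.concatenate([state, k])
        v_ext = np.concatenate([state, v])
        y = chunk_attention(q, k_ext, v_ext)
        outputs.append(y)
        state = y.mean(axis=0, keepdims=True)  # compress chunk into the state
    return np.concatenate(outputs)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))
wq, wk, wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = recurrent_local_attention(x, wq, wk, wv)
print(out.shape)  # (16, 8)
```

The real model would use learned projections and a more careful state update; the point of the sketch is only that each step touches a chunk-sized attention matrix rather than the full sequence.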
## Quick Start