leo-pekelis-gradient committed on
Commit
5cfd414
1 Parent(s): 9411de7

Update README.md

Files changed (1)
  1. README.md +2 -3
README.md CHANGED
@@ -7,10 +7,9 @@ tags:
 - llama-3
 ---
 
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/6585dc9be92bc5f258156bd6/F2WLF8_jOx_gttxbPtLK1.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6585dc9be92bc5f258156bd6/hiHWva3CbsrnPvZTp5-lu.png)
 
-This model extends LLama-3 8B's context length from 8k to > 130K, developed by Gradient, sponsored by compute from Crusoe Energy. It demonstrates that SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.
+This model extends LLama-3 8B's context length from 8k to > 160K, developed by Gradient, sponsored by compute from Crusoe Energy. It demonstrates that SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.
 
 **Approach:**
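
For readers unfamiliar with the "adjusting RoPE theta" technique the changed description refers to, here is a minimal sketch, assuming the standard rotary-embedding formulation; this is not the authors' training code. The `rope_inv_freq` helper and the larger theta value below are illustrative assumptions; the checkpoint's actual setting is in its config.json.

```python
# A minimal sketch, assuming standard RoPE: inverse frequencies are derived
# from a base "theta", and raising theta slows the per-dimension rotation,
# which is what lets attention generalize to longer contexts.
import torch

def rope_inv_freq(dim: int, theta: float) -> torch.Tensor:
    # Standard RoPE inverse frequencies: 1 / theta^(2i/dim) for even i.
    return 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))

# Llama-3 8B ships with rope_theta = 500000; long-context variants raise it.
# The 8e6 value here is hypothetical, chosen only to show the effect.
base = rope_inv_freq(128, theta=500_000.0)
scaled = rope_inv_freq(128, theta=8_000_000.0)
# Larger theta -> lower minimum frequency, i.e. longer positional wavelengths.
print(base.min().item(), scaled.min().item())
```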