markpreemo committed
Update README.md

README.md (CHANGED)
Join our custom agent and long context (262k-1M+) waitlist: https://forms.gle/L6
Gradient incorporates your data to deploy autonomous assistants that power critical operations across your business. To learn more or collaborate on a custom model, drop us a message at contact@gradient.ai.

[Join our Discord](https://discord.com/invite/2QVy2qt2mf)

This model extends Llama-3 8B's context length from 8k to > 160K, developed by Gradient, sponsored by compute from [Crusoe Energy](https://huggingface.co/crusoeai). It demonstrates that SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.
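The RoPE-theta adjustment mentioned above can be sketched as follows. A minimal illustration, assuming Llama-3 8B's head dimension of 128 and its default base of 500,000; the larger target theta here is an arbitrary example, not Gradient's actual training value:

```python
import math

def rope_inv_freq(dim: int, theta: float) -> list[float]:
    """Per-pair inverse rotation frequencies used by rotary position embeddings."""
    return [1.0 / theta ** (2 * i / dim) for i in range(dim // 2)]

# A larger theta slows the rotation of every dimension pair, so tokens that
# are far apart remain distinguishable at longer context lengths.
base = rope_inv_freq(128, 500_000.0)      # Llama-3 8B default head_dim / theta
scaled = rope_inv_freq(128, 4_000_000.0)  # illustrative long-context value (assumption)
assert all(s <= b for s, b in zip(scaled, base))
```

Raising theta and then fine-tuning briefly on long sequences is the essence of the "minimal training" claim: the model only has to adapt to a rescaled position signal, not learn long-range attention from scratch.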

**Update (5/3): We further fine-tuned our model to strengthen its assistant-like chat ability as well. The NIAH result is updated.**