gradientai/Llama-3-8B-Instruct-262k


markpreemo committed 29 days ago

Commit 070a5db • 1 Parent(s): dc9a5ae

Update README.md

Files changed (1)

README.md +2 -0


@@ -15,6 +15,8 @@ Join our custom agent and long context (262k-1M+) waitlist: https://forms.gle/L6

 Gradient incorporates your data to deploy autonomous assistants that power critical operations across your business. To learn more or collaborate on a custom model, drop us a message at contact@gradient.ai.
+[Join our Discord](https://discord.com/invite/2QVy2qt2mf)
 This model extends LLama-3 8B's context length from 8k to > 160K, developed by Gradient, sponsored by compute from [Crusoe Energy](https://huggingface.co/crusoeai). It demonstrates that SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.
 **Update (5/3): We further fine-tuned our model to strengthen its assistant-like chat ability as well. The NIAH result is updated.**