gradientai
/

Llama-3-8B-Instruct-262k

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

forrest-gradient commited on May 4, 2024

Commit

dc9a5ae

·

verified ·

1 Parent(s): 008b982

Update README.md

Files changed (1) hide show

README.md +3 -0

README.md CHANGED Viewed

@@ -10,6 +10,9 @@ license: llama3
 <img src="https://cdn-uploads.huggingface.co/production/uploads/655bb613e8a8971e89944f3e/TSa3V8YpoVagnTYgxiLaO.png" width="200"/>
 # Llama-3 8B Gradient Instruct 262k
 Gradient incorporates your data to deploy autonomous assistants that power critical operations across your business. To learn more or collaborate on a custom model, drop us a message at contact@gradient.ai.
 This model extends LLama-3 8B's context length from 8k to > 160K, developed by Gradient, sponsored by compute from [Crusoe Energy](https://huggingface.co/crusoeai). It demonstrates that SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.

 <img src="https://cdn-uploads.huggingface.co/production/uploads/655bb613e8a8971e89944f3e/TSa3V8YpoVagnTYgxiLaO.png" width="200"/>
 # Llama-3 8B Gradient Instruct 262k
+Join our custom agent and long context (262k-1M+) waitlist: https://forms.gle/L6TDY7dozx8TuoUv7
 Gradient incorporates your data to deploy autonomous assistants that power critical operations across your business. To learn more or collaborate on a custom model, drop us a message at contact@gradient.ai.
 This model extends LLama-3 8B's context length from 8k to > 160K, developed by Gradient, sponsored by compute from [Crusoe Energy](https://huggingface.co/crusoeai). It demonstrates that SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.