Discussion: Llama3 16k
Say Lewdiculous, been enjoying your new models and was wondering if you'd heard about the new community Llama 3 with a 16k context window, and if you're planning on fine-tuning it?
I don't tune, haha.
I brought this up with someone that could work on merges here:
https://huggingface.co/ChaoticNeutrals/Poppy_Porpoise-v0.6-L3-8B/discussions/1#662741c4088f0f0c9187de7b
But it seems it's not so necessary, since the regular model with 8K native context using RoPE scaling (it's automatic on KoboldCpp for example) can already handle up to 32K very well, and 16K even better.
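As a rough sketch of what "automatic RoPE scaling" does under the hood — the exact formula KoboldCpp applies is an assumption here, this is the common NTK-aware variant (bloc97-style), with Llama's 128 head dim and 10000 frequency base assumed:

```python
def ntk_rope_freq_base(scale: float, base: float = 10000.0,
                       head_dim: int = 128) -> float:
    """NTK-aware RoPE scaling: instead of compressing positions, stretch
    the rotary frequency base so low frequencies span a longer context
    while high frequencies (local detail) are barely touched.
    `scale` is target_ctx / native_ctx."""
    return base * scale ** (head_dim / (head_dim - 2))

# Extending an 8K-native model to 32K is a 4x scale;
# scale 1.0 leaves the base untouched.
print(ntk_rope_freq_base(1.0))
print(ntk_rope_freq_base(4.0))
```

The upshot is that the model never sees out-of-range positions, which is why no fine-tune is strictly required for moderate extensions.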
In the Oobabooga WebUI that's compress_pos_emb, correct? If I want to use 32K tokens of context I'd set it to 4, right?
I only use GGUF models with KoboldCpp, and that's gonna be my recommendation and supported format for the time being. @Nitral-AI might be able to talk about EXL2 scaling on Ooba.
In the Oobabooga WebUI that's compress_pos_emb, correct? If I want to use 32K tokens of context I'd set it to 4, right?
According to the wiki yes, 4 is quadruple the context.
compress_pos_emb: The first and original context-length extension method, discovered by kaiokendev. When set to 2 the context length is doubled, when set to 3 it's tripled, and so on. It should only be used for models that have been fine-tuned with this parameter set to a value other than 1. For models that have not been tuned for greater context length, alpha_value will lead to a smaller accuracy loss.
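For contrast with the NTK/alpha approach, compress_pos_emb is plain linear interpolation: position indices are divided by the factor before the rotary embedding is applied, so 32K real positions fit inside the 8K range the model was trained on. A minimal sketch, using the factor and context sizes from this thread:

```python
def compressed_positions(n_tokens: int, factor: int) -> list[float]:
    """Linear RoPE interpolation (kaiokendev): every token's position
    index is divided by `factor` before computing rotary embeddings."""
    return [i / factor for i in range(n_tokens)]

# With factor 4, the last of 32768 tokens sits at position 8191.75,
# i.e. still inside the model's native 8K positional range.
pos = compressed_positions(32768, 4)
print(pos[-1])  # 8191.75
```

This is why the wiki says it should match what the model was fine-tuned with: squeezing positions together changes what every position "means" to the model.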
Say Lewdiculous, you said 16k tokens wasn't appealing. Would one million+ context tokens be a worthy challenge?
https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k
lmao that's a lot of context haha folks operating miracles, kick Gemini where it hurts!
Should I do some memes with this now, I wonder?
Nitral you could potentially take on Gemini "but for the modern coomer roleplayer", KEKW.
My body and my wallet already failed after 1 million context.
Potentially after Poppy 1.0 drops.
I was 90% joking but I can see the meme calls for your soul.
RTX 5090 HERE I COME!
Gonna need a few-
It's just like 190GB for 1.048M ctx at Q8, maybe 30GB with Flash-Attention?
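For what it's worth, a back-of-the-envelope KV-cache estimate for a Llama-3-8B-shaped model (32 layers, 8 KV heads, head dim 128 — GQA config values assumed from the public model card) lands in the same painful ballpark:

```python
def kv_cache_gib(ctx: int, n_layers: int = 32, n_kv_heads: int = 8,
                 head_dim: int = 128, bytes_per_val: int = 2) -> float:
    """KV cache size = 2 (K and V) * layers * kv_heads * head_dim
    * context length * bytes per value. bytes_per_val: 2 for FP16,
    1 for an 8-bit quantized cache."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_val / 2**30

print(kv_cache_gib(1_048_576))                    # FP16 cache: 128.0 GiB
print(kv_cache_gib(1_048_576, bytes_per_val=1))   # 8-bit cache: 64.0 GiB
```

Note that Flash Attention mainly avoids materializing the attention matrix; it doesn't shrink the KV cache itself, so quantizing the cache is where the big savings come from.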
The magic
I'd be mind-blown if Nvidia released a 48GB GPU to consumers (or, as Nvidia sees consumers, cockroaches).
A 48GB RTX 6000 is $15K here. Sadly they removed the ability to see prices for A100s, but they were around $30K.
They're cheaper in Australia than here.
This blows my mind though
That's without the system included :3