Discussion: Llama3 16k

#31
by Vi6DDarkKing - opened

Say, Lewdiculous, I've been enjoying your new models and was wondering if you'd heard about the new community Llama 3 with a 16k context window, and whether you're planning on fine-tuning it?

Lewdiculous changed discussion title from Llama3 16k to Discussion: Llama3 16k

I don't tune, haha.

I brought this up with someone who could work on merges here:
https://huggingface.co/ChaoticNeutrals/Poppy_Porpoise-v0.6-L3-8B/discussions/1#662741c4088f0f0c9187de7b

But it seems it's not really necessary, since the regular model with its 8K native context can already handle up to 32K very well with RoPE scaling (it's automatic on KoboldCpp, for example), and 16K even better.
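To give a rough idea of what that scaling means: linear RoPE scaling just compresses positions back into the range the model was trained on. A minimal sketch, assuming Llama 3's 8K native context (KoboldCpp picks its values automatically, so this is only illustrative):

```python
# Linear RoPE scaling (position interpolation), illustrative only.
# Assumption: Llama 3's native training context of 8192 tokens.
NATIVE_CTX = 8192

def rope_freq_scale(target_ctx: int, native_ctx: int = NATIVE_CTX) -> float:
    """Scale factor that compresses target positions back into the
    trained range: 16K needs 0.5, 32K needs 0.25."""
    return native_ctx / target_ctx

for target in (16_384, 32_768):
    print(f"{target:>6} ctx -> rope_freq_scale = {rope_freq_scale(target)}")
```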

Lewdiculous changed discussion status to closed

In the Oobabooga WebUI that's compress_pos_emb, correct? If I want to use 32K tokens of context, I'd set it to 4, right?

I only use GGUF models with KoboldCpp, and that's going to be my recommendation and supported format for the time being. @Nitral-AI might be able to talk about EXL2 scaling on Ooba.

> In the Oobabooga WebUI that's compress_pos_emb, correct? If I want to use 32K tokens of context, I'd set it to 4, right?

According to the wiki, yes: 4 is quadruple the context.

compress_pos_emb: The first and original context-length extension method, discovered by kaiokendev. When set to 2 the context length is doubled, at 3 it's tripled, and so on. It should only be used for models that have been fine-tuned with this parameter set to a value other than 1. For models that have not been tuned to have greater context length, alpha_value will lead to a smaller accuracy loss.
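To make that concrete, here's a minimal sketch of what the two knobs do, assuming the usual Llama-style RoPE base of 10000 and a head dimension of 128 (the alpha formula is the commonly cited NTK-aware one, so treat the exact exponent as an assumption rather than Ooba's verbatim implementation):

```python
# Illustrative only: the two context-extension knobs from the wiki quote.
# Assumptions: RoPE base 10000, head dimension 128 (Llama-style).
BASE, HEAD_DIM = 10_000.0, 128

def compress_pos_emb(position: int, factor: int) -> float:
    """Linear position interpolation: divide positions by the factor,
    so factor=4 squeezes 32K positions into an 8K trained window."""
    return position / factor

def alpha_to_rope_base(alpha: float) -> float:
    """NTK-aware scaling (alpha_value): leave positions alone and raise
    the RoPE frequency base instead, which tends to lose less accuracy
    on models that weren't fine-tuned for longer context."""
    return BASE * alpha ** (HEAD_DIM / (HEAD_DIM - 2))

print(compress_pos_emb(32_768, 4))     # 8192.0, back in the trained window
print(round(alpha_to_rope_base(4.0)))  # ~40890, the new rope_freq_base
```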

Say, Lewdiculous, you said 16k tokens wasn't appealing. Would one million+ context tokens be a worthy challenge?

https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k

lmao, that's a lot of context haha. Folks are operating miracles, kick Gemini where it hurts!

> Say, Lewdiculous, you said 16k tokens wasn't appealing. Would one million+ context tokens be a worthy challenge?
>
> https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k

Should I do some memes with this now, I wonder?

Nitral you could potentially take on Gemini "but for the modern coomer roleplayer", KEKW.

My body and my wallet already failed after 1 million context.

> Nitral you could potentially take on Gemini "but for the modern coomer roleplayer", KEKW.
>
> My body and my wallet already failed after 1 million context.

Potentially after Poppy 1.0 drops.

I was 90% joking but I can see the meme calls for your soul.

> I was 90% joking but I can see the meme calls for your soul.

RTX 5090 HERE I COME!

Gonna need a few-
It's just like 190GB for 1.048M ctx at Q8, maybe 30GB with Flash-Attention?
The magic
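For anyone wondering where numbers like that come from, the textbook KV-cache math is sketched below, assuming Llama-3-8B-ish shapes (32 layers, 8 KV heads with GQA, head dim 128). Real backends add scratch buffers and may not exploit GQA, which is how you get to much bigger figures than this formula alone:

```python
# Back-of-envelope KV-cache size; actual allocations vary by backend.
# Assumptions: 32 layers, 8 KV heads (GQA), head dim 128, Q8 = 1 byte/elem.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128
BYTES_PER_ELEM = 1  # Q8 cache; use 2 for FP16

def kv_cache_gib(ctx_tokens: int) -> float:
    # 2x for keys and values, per layer, per KV head, per head-dim element.
    per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_ELEM
    return per_token * ctx_tokens / 2**30

print(f"{kv_cache_gib(1_048_576):.0f} GiB")  # ~64 GiB at Q8 with GQA
```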
I'd be mind-blown if Nvidia released a 48GB GPU to consumers (or, as Nvidia sees consumers, cockroaches).
A 48GB RTX 6000 is $15K here. Sadly they removed the ability to see prices for A100s, but they were like $30K.
They're cheaper in Australia than here.
[image]
This blows my mind though
[image]
That's without the system included :3
