
Oh Come on

#1
by supercharge19 - opened

What is 1M? That can't be the context length, can it?

If it is, what is the RAM requirement for that, 1 million tokens?

Cognitive Computations org

It is indeed the context length. You'd likely need 506.25 GB of VRAM to run it at the full 1M context length, though you can lower the ctx until it fits in your available RAM.

I estimated that vram requirement using this tool: https://llm-calc.rayfernando.ai/

Cognitive Computations org

What is 1M? That can't be the context length, can it?

If it is, what is the RAM requirement for that, 1 million tokens?

Wait until I'm done uploading it. Also, I have lots of model cards to do, and you aren't paying me anything; I work at my own pace.

ehartford changed discussion status to closed
Cognitive Computations org

@ehartford I can help you if you need help :)

It is indeed the context length. You'd likely need 506.25 GB of VRAM to run it at the full 1M context length, though you can lower the ctx until it fits in your available RAM.

I estimated that vram requirement using this tool: https://llm-calc.rayfernando.ai/

Thanks, that is a very useful resource.

24 GB of RAM was just barely enough for 16K context for me, so one million sounds like it would need a lot more than 500 GB.

Cognitive Computations org

Perhaps - but the way context memory is computed and stored isn't exactly linear, and having 48 GB of VRAM doesn't mean you'll be stuck at just 32K context length. The model weights also only take up a fixed amount of space.
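To make the "context costs memory, weights are fixed" point concrete, here is a rough back-of-the-envelope KV-cache estimate. The config numbers (layers, KV heads, head dim, fp16 cache) are assumptions for a Llama-3-8B-style architecture with grouped-query attention, not this model's exact specs, so treat the outputs as ballpark figures only:

```python
# Rough KV-cache size estimate for a Llama-style decoder.
# Assumed config (Llama-3-8B-like, NOT verified against this model):
#   32 layers, 8 KV heads, head dim 128, fp16 cache (2 bytes/element).

def kv_cache_gb(ctx_len, n_layers=32, n_kv_heads=8, head_dim=128,
                bytes_per_elem=2):
    """GiB needed for the K and V caches across all layers at ctx_len tokens."""
    # Factor of 2 accounts for the separate K and V tensors per layer.
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem
    return total_bytes / 1024**3

for ctx in (16_384, 131_072, 1_048_576):
    print(f"{ctx:>9} tokens -> ~{kv_cache_gb(ctx):.1f} GiB KV cache")
```

On top of that, the weights themselves need a fixed chunk (roughly 16 GB for an 8B model in fp16, less when quantized), which is why total memory isn't a simple multiple of context length.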

This is regular RAM, not VRAM. Is that worse?

Cognitive Computations org

This is regular RAM, not VRAM. Is that worse?

It depends, but people tend to mix up the words RAM and VRAM because llama.cpp exists. Needing more VRAM is generally worse.

https://llm-calc.rayfernando.ai/ seems down right now, or is it just me?

I meant I am not using my GPU at all (it sucks anyway).
The module inside the link is down for me too; it doesn't do any calculation, so it's not just you.

https://llm-calc.rayfernando.ai/ seems down right now, or is it just me?

I had to do a re-deploy of the app and it is back up and running now.

Thank you all so much for the shoutout and for using the tool!

Cognitive Computations org

Thanks @RayFernando1337, and now I know how to tag you too!

Still says "embeds.beehiiv.com refused to connect" on the module for me.

Cognitive Computations org

Works for me - try a Chromium-based browser.
