amazing

#3
by Akicou - opened

The context window is so long 😍 NVIDIA is cooking

How long is it?

`max_position_embeddings` is set to 262144 in the config.json, so that's 256K tokens (262144 = 256 × 1024), which is roughly a quarter of a million-token context window.
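If you want to check the number yourself, here is a minimal sketch of the arithmetic. It assumes you have the model's config.json contents as a string; the `config_text` excerpt and the `context_length_in_k` helper are made up for illustration, and only the `max_position_embeddings` value comes from this thread.

```python
import json

# Hypothetical excerpt of a config.json; only the
# max_position_embeddings value is taken from the thread.
config_text = '{"max_position_embeddings": 262144}'


def context_length_in_k(config_json: str) -> int:
    """Return max_position_embeddings in units of 1024 tokens."""
    config = json.loads(config_json)
    return config["max_position_embeddings"] // 1024


tokens = json.loads(config_text)["max_position_embeddings"]
print(context_length_in_k(config_text))   # 256, i.e. "256K"
print(tokens / 1_000_000)                 # 0.262144, about a quarter of 1M
```

Note that this only tells you the configured position limit, not how well the model actually performs at that length.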

It's much longer than the usual open-weight diffusion-based LLMs (e.g. inclusionAI/LLaDA2.1-flash, relaxe-system-lab/UltraLLaDA).
