CTX size.
Hi,
does this model support 32k context without NTK RoPE scaling?
From what I can see, merged models have 8k context and some 16k, so my guess is that this model is rather limited to 8k context.
This is a Mixtral ExLlama model. Refer to the config.json of the non-quantized model for the context size. Mixtral supports 32k context.
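For example, a quick way to check is to read the value straight from config.json (a minimal sketch, assuming the standard Hugging Face config layout; the file path is illustrative):

```python
import json

# Read the context length advertised by the base (non-quantized) model.
# Path is illustrative; point it at your local copy of config.json.
with open("config.json") as f:
    config = json.load(f)

# Mixtral-style configs expose the trained context window here.
print(config.get("max_position_embeddings"))  # 32768 for Mixtral-8x7B
```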
Yes, the config.json specifies a context length of 32k. However, all MoE merges I tried failed above 8k context. It appears that without further adjustments, they cannot reach the full context.
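For what it's worth, the usual adjustment when running long context through ExLlamaV2 is to raise the NTK RoPE alpha along with the sequence length. A minimal sketch, assuming the exllamav2 Python package, an illustrative model path, and an alpha of 2.6 as a starting point (not a verified setting for this merge):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer

# Illustrative path to the quantized model directory.
config = ExLlamaV2Config()
config.model_dir = "/models/mixtral-merge-exl2"
config.prepare()

config.max_seq_len = 32768      # target context window
config.scale_alpha_value = 2.6  # NTK RoPE alpha; tune per model/merge

model = ExLlamaV2(config)
model.load()
tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model, max_seq_len=config.max_seq_len)
```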
It depends on the available memory and how you are running the inference. Even 8k or 16k of context takes a large amount of memory to handle, because each batch or chunk has to carry the state for the whole inference pipeline. So as you increase the context window, the complexity and memory requirements scale at a much greater rate.
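As a rough illustration of the memory side, here is a back-of-the-envelope KV-cache estimate. The shape parameters are taken from the Mixtral-8x7B config (32 layers, 8 KV heads, head dim 128) and assume an FP16 cache at batch size 1; a quantized cache would shrink this, and attention compute on top of it grows roughly quadratically with context:

```python
def kv_cache_bytes(seq_len, num_layers=32, num_kv_heads=8, head_dim=128, bytes_per_elem=2):
    # 2x for keys and values; one entry per layer, KV head, head dim and token.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem

for ctx in (8192, 16384, 32768):
    gib = kv_cache_bytes(ctx) / 1024**3
    print(f"{ctx:>6} tokens -> ~{gib:.1f} GiB KV cache")
# ~1.0 GiB at 8k, ~2.0 GiB at 16k, ~4.0 GiB at 32k (FP16, batch size 1)
```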