Maximum context


What is the maximum context of this model? 2048? 4096? Or can it be set to any value?

When running this model I get the following output:
llm_load_print_meta: format = GGUF V2 (latest)
llm_load_print_meta: n_ctx_train = 4096
...

That means the model was trained with a 4096-token context (n_ctx_train = 4096), so it performs best with the context set to 4096. You can still extend the context window beyond that, though.
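For reference, a minimal invocation at the trained context size might look like the sketch below (the binary name and model filename are assumptions; older llama.cpp builds ship the example as ./main, and -c sets n_ctx -- check --help for your build):

```
# Minimal sketch: run at the trained context size.
# Model filename is hypothetical; adjust to your GGUF file.
./main -m model.Q4_K_M.gguf \
  -c 4096 \    # n_ctx: matches the n_ctx_train reported at load time
  -n 256 \     # number of tokens to generate
  -p "Your prompt here"
```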

I have tested extending the context to 16K using RoPE scaling (via parameters in llama.cpp, in case you are using this software) to summarize longer articles, and it worked fine:
llm_load_print_meta: format = GGUF V2 (latest)
llm_load_print_meta: n_ctx_train = 4096
llm_load_print_meta: n_ctx = 16384
...
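As a sketch of what that looks like on the command line (flag names are from llama.cpp's example program and may differ between builds, so check --help): with linear RoPE scaling, the frequency scale is roughly n_ctx_train / n_ctx, i.e. 4096 / 16384 = 0.25.

```
# Sketch: extend the context to 16K with linear RoPE scaling.
# Model filename and prompt file are hypothetical.
./main -m model.Q4_K_M.gguf \
  -c 16384 \                # requested context window (n_ctx)
  --rope-freq-scale 0.25 \  # linear scaling factor: 4096 / 16384
  -f article.txt            # prompt file containing the article to summarize
```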

But overall, it is probably best to stick with the 4K context window: pushing to a larger window via RoPE extension increases memory requirements and can degrade output quality.
