
For the original 200k context, would it be better to do an NTK patch with 4k?

#5
by Trangle - opened

When extending a short-context model, NTK scaling uses a magnification factor greater than 1. Here, would it be better to go the other way and use a reduction factor less than 1 to bring the context down to around 4k? That shouldn't interfere with extending it again later, and it might even improve the original model's long-text ability.
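For concreteness, here is a minimal sketch of what that would look like, assuming the common NTK-aware adjustment `base' = base * s^(d/(d-2))` where `s` is the ratio of target to original context; the base value, head dimension, and function name below are illustrative, not taken from this model's config:

```python
# Sketch of NTK-aware RoPE base adjustment with a reduction factor s < 1.
# Assumed formula: base' = base * s^(d / (d - 2)), s = target_ctx / original_ctx.

def ntk_scaled_base(base: float, scale: float, dim: int) -> float:
    # scale > 1 stretches positions (the usual context-extension case);
    # scale < 1 compresses them, as proposed here (200k -> 4k gives s ~= 0.02).
    return base * scale ** (dim / (dim - 2))

orig_base = 10000.0          # typical Llama RoPE theta; the real model may differ
head_dim = 128               # per-head dimension (assumption)
s = 4096 / 200_000           # reduction factor < 1

print(ntk_scaled_base(orig_base, s, head_dim))  # a smaller effective base
```

With `s < 1` the effective base shrinks, so rotary frequencies rise and the 4k window gets finer positional resolution, which is the intuition behind the question.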
