No SWA?

#6
by byunal - opened

I think this still suffers with sequences longer than 512 tokens. However, Mistral should solve this by using SWA (sliding window attention). How can I tackle this issue? Or is there a similar, more lightweight model for this?

@byunal Hmm, no, it has a much higher context length than 512. In fact, all Llama, Qwen, and Mistral models have a context length higher than 2048.

I think you are using something like ctransformers or llama-cpp-python, which sets the context limit to 512; you have to change it to your desired length.
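For ctransformers, the context window can be raised with the `context_length` argument when loading the model. A minimal sketch, assuming a GGUF model (the repo id and file name below are placeholders, substitute whichever model you are actually loading):

```python
# Minimal sketch: raising the context window in ctransformers.
# The repo id and file name are placeholders for your own GGUF model.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",            # placeholder repo id
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",    # placeholder GGUF file
    model_type="mistral",
    context_length=4096,   # raise the 512-token limit to the length you need
    max_new_tokens=512,    # generation budget for the summary
)

print(llm("Summarize the following text:\n..."))
```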

@YaTharThShaRma999 Actually, yes. I'm trying to use this model for text summarization on CPU via ctransformers. Currently, I have no access to any GPU, so I have to do inference on CPU. Frankly, I didn't know that ctransformers limits the context length. How can I override this limit on CPU? I'd appreciate it if you could help.

I have the same issue, where ctransformers is limiting it to 512. I thought it was something wrong with the model configuration.
Were you able to solve this issue?

EDIT: https://discuss.huggingface.co/t/number-of-tokens-2331-exceeded-maximum-context-length-512-error-even-when-model-supports-8k-context-length/57180/6
I just passed the argument to the function as suggested in the link above.
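For anyone hitting the same 512 limit through llama-cpp-python (the other library mentioned above), the equivalent setting is `n_ctx`. A minimal sketch, with the model path as a placeholder:

```python
# Minimal sketch: setting the context window in llama-cpp-python.
# The model_path is a placeholder for your local GGUF file.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,  # context window; some versions default this to 512
)

out = llm("Summarize the following text:\n...", max_tokens=256)
print(out["choices"][0]["text"])
```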

@esuriddick That didn't solve it and I'm done dealing with it, but thanks.
