Gibberish output when using 24k context length

#1 by rjmehta - opened

The max position embeddings show 32k, but at a 24k context length the model starts producing gibberish output. Is this a 32k or 4k context length? I am using transformers 4.34.0.

@TheBloke Please advise.

This is 8k. The 32k context is possible with sliding-window attention, but that's only supported in Hugging Face Transformers.

Since this is fine-tuned from Mistral, it does have around 8k (probably slightly less), but not 32k.
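
For reference, both numbers can be read straight from the model config. A minimal sketch, assuming the base `mistralai/Mistral-7B-v0.1` repo (the fine-tune discussed here may report different values):

```python
from transformers import AutoConfig

# Base Mistral config used here for illustration; the actual
# fine-tune in this discussion may differ.
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

# Mistral advertises 32k positions, but attention is windowed,
# so the effective usable context is much shorter in loaders
# that don't implement the sliding window.
print(config.max_position_embeddings)  # 32768
print(config.sliding_window)           # 4096 tokens per attention window
```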

rjmehta changed discussion status to closed
