Inference with anything over 2k tokens causes the following error

#1
by winglian - opened

RuntimeError: The size of tensor a (4096) must match the size of tensor b (4097) at non-singleton dimension 3
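
For context on what the message means: an elementwise operation between two tensors requires that, at each dimension, the sizes are equal or one of them is 1. The sketch below applies that rule in plain Python to two made-up shapes. The shapes, the `(batch, heads, query_len, key_len)` layout, and the helper name `first_mismatch` are assumptions for illustration, not taken from the model's code, so this shows only the kind of off-by-one mismatch reported and not its actual cause.

```python
def first_mismatch(shape_a, shape_b):
    """Return the first dimension where two equal-rank shapes cannot broadcast, or None."""
    for dim, (a, b) in enumerate(zip(shape_a, shape_b)):
        # Sizes are compatible only if they match or one of them is 1.
        if a != b and a != 1 and b != 1:
            return dim
    return None


# Hypothetical shapes, assumed layout: (batch, heads, query_len, key_len)
scores = (1, 32, 1, 4096)
mask = (1, 1, 1, 4097)

print(first_mismatch(scores, mask))  # prints 3
```

Dimensions 0 to 2 are compatible because one side is 1 or the sizes match. Dimension 3 has 4096 against 4097, which is the dimension named in the error.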
