Report a bug?

#7
by PotatoesJay - opened

Reproduce:

  1. Download this repo's weights;
  2. python3 -m lmdeploy.serve.turbomind.deploy internlm-chat-7b /path/to/internlm-chat-7b;
  3. python3 -m lmdeploy.turbomind.chat ./workspace;
  4. Type in a very long input; once it reaches the max of 2056 tokens, it warns that the max input length is exceeded.

Issue:
In turbomind/turbomind.py, at line 101, is self.session_len always 2048?

InternLM org

Please set session_len in workspace/triton_models/weights/config.ini.
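
For anyone landing here later, this is a rough sketch of that edit done from Python instead of by hand. It assumes config.ini is a plain INI file that the standard configparser module can read; the helper name set_session_len and the value 4096 are only illustrative and not part of lmdeploy. Because the section name is not stated in this thread, the sketch updates whichever section already contains a session_len key rather than guessing one.

```python
import configparser


def set_session_len(config_path, new_len):
    """Set session_len in every section of an INI file that already defines it.

    Hypothetical helper, not an lmdeploy API. Returns the updated section names.
    """
    # interpolation=None keeps values containing '%' untouched.
    parser = configparser.ConfigParser(interpolation=None)
    with open(config_path, encoding="utf-8") as f:
        parser.read_file(f)

    updated = []
    for section in parser.sections():
        if parser.has_option(section, "session_len"):
            parser.set(section, "session_len", str(new_len))
            updated.append(section)

    if not updated:
        raise KeyError(f"no session_len key found in {config_path}")

    # Note: configparser rewrites the file and drops any comments in it.
    with open(config_path, "w", encoding="utf-8") as f:
        parser.write(f)
    return updated
```

It could be called as set_session_len("workspace/triton_models/weights/config.ini", 4096) before starting the chat again. Keep a backup of the file first, since the rewrite discards comments and may change spacing.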

PotatoesJay changed discussion status to closed
