[UNUSEDTOKEN145] #15, opened by MoonRide
Still something wrong with the tokenizer (or the config). Reproduction steps below, using the current version (b2901) of llama.cpp.
Steps:
- Convert to GGUF:
  `convert-hf-to-gguf.py --outtype f16 ..\InternLM2-Chat-7B\ --outfile InternLM2-Chat-7B-F16.gguf`
- Quantize to Q6_K:
  `quantize.exe .\InternLM2-Chat-7B-F16.gguf .\InternLM2-Chat-7B-Q6_K.gguf Q6_K`
- Launch the server:
  `server -v -ngl 99 -m InternLM2-Chat-7B-Q6_K.gguf -n 300 -c 32768 --chat-template chatml`
- Open http://localhost:8080/ and set it up as below (pretty standard generic configuration):
- Start talking:
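For context, `--chat-template chatml` means the server wraps each turn in ChatML markers, so every completed turn should end with `<|im_end|>`. A minimal sketch of the expected prompt layout (the helper function is illustrative, not llama.cpp code):

```python
def chatml_prompt(messages):
    """Build a ChatML prompt; each turn is closed by <|im_end|>."""
    out = ""
    for role, content in messages:
        out += f"<|im_start|>{role}\n{content}<|im_end|>\n"
    # Leave the assistant turn open for the model to complete.
    return out + "<|im_start|>assistant\n"

print(chatml_prompt([("system", "You are a helpful assistant."),
                     ("user", "Hi!")]))
```

If the model's end-of-turn token decodes to `[UNUSEDTOKEN145]` instead of `<|im_end|>`, the server never detects the end of the assistant's turn.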
Outcome: tokenization is broken; [UNUSEDTOKEN145] appears instead of the end-of-turn token.
Expected outcome: the conversation proceeds in proper turns, with [UNUSEDTOKEN145] never appearing in the output.
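A toy illustration of what the symptom suggests (assumption: the chat model reuses an "unused" base-vocab slot for `<|im_end|>`, so a conversion that keeps the base-vocab name decodes the end-of-turn id to its placeholder string; the id 92542 here is hypothetical):

```python
# Base vocab names the slot as a placeholder; the chat model repurposes it.
base_vocab = {92542: "[UNUSEDTOKEN145]"}     # hypothetical token id
chat_overrides = {92542: "<|im_end|>"}       # name the chat model expects

def decode(token_id, apply_overrides):
    # If the converter ignores the chat-model overrides, the placeholder
    # name leaks into the generated text instead of <|im_end|>.
    vocab = {**base_vocab, **(chat_overrides if apply_overrides else {})}
    return vocab[token_id]

print(decode(92542, apply_overrides=False))  # the bug: [UNUSEDTOKEN145]
print(decode(92542, apply_overrides=True))   # expected: <|im_end|>
```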