Broken tokenizer?

#1
by ChuckMcSneed - opened

It sometimes spits out random numbers and repeats text. Not very good.

Which format, and what hardware are you using?

Q5_K_S, tried running it on llama.cpp and kobold. RTX 3080, Intel CPU, not sure how that's relevant. How did you get it to quantize? Which scripts did you use? I can't quantize it with the default settings in llama.cpp.
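
To be concrete, by "default settings" I mean the usual two-step llama.cpp pipeline, roughly like this (paths and filenames are placeholders):

```
# Step 1: convert the HF checkpoint to an f16 GGUF (this is the step that fails here)
python convert.py /path/to/model --outtype f16 --outfile model-f16.gguf

# Step 2: quantize the f16 GGUF down to Q5_K_S
./quantize model-f16.gguf model-Q5_K_S.gguf Q5_K_S
```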

Your GPU's probably too small; I'd suggest a smaller model.

I'm not running it on the GPU; I'm running it on the CPU with cuBLAS processing. If I had memory problems, I wouldn't be able to run it at all. Just tell me how you got it quantized.
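
In other words, a launch roughly like this (koboldcpp shown; the model filename is a placeholder):

```
# Nothing is offloaded to VRAM (--gpulayers 0); cuBLAS only accelerates
# prompt processing on the RTX 3080 while inference itself runs on the CPU
python koboldcpp.py --model model-Q5_K_S.gguf --usecublas --gpulayers 0
```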

Ah, I see you have 70B models. Just check the README.

Oh... it has a BPE vocab... that's why it didn't convert. I converted it on my own machine, and the broken output still persists, so it must have been inherited from MoMo: https://huggingface.co/moreh/MoMo-72B-lora-1.8.6-DPO/discussions/7
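
For anyone else hitting this: the conversion goes through once you tell the convert script about the BPE vocab, something like the sketch below (flag spelling differs across llama.cpp versions; older checkouts call it --vocabtype):

```
# Explicitly selecting the BPE vocab lets convert.py handle the tokenizer;
# older llama.cpp trees spell the flag --vocabtype bpe
python convert.py /path/to/model --vocab-type bpe --outtype f16 --outfile model-f16.gguf
```

Note this only fixes the conversion failure, not the garbled output, which comes from the upstream model.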

ChuckMcSneed changed discussion status to closed
