Cannot run on llama.cpp and koboldcpp

#1
by FenixInDarkSolo - opened

I have downloaded the q5_1.bin file and tried to run it in llama.cpp and koboldcpp, but it does not work.
I have checked the file's SHA256 and it matches.
Here is llama.cpp's error output:

```
main -m ./models/starcoder-ggml-q5_1.bin -t 12 -n -1 -c 2048 --keep -1 --repeat_last_n 2048 --top_k 160 --top_p 0.95 --color -ins -r "User:" --keep -1 --interactive-first
main: build = 536 (cdd5350)
main: seed  = 1684312164
llama.cpp: loading model from ./models/starcoder-ggml-q5_1.bin
error loading model: missing tok_embeddings.weight
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/starcoder-ggml-q5_1.bin'
main: error: unable to load model
```

```
certutil -hashfile starcoder-ggml-q5_1.bin SHA256
SHA256 hash of starcoder-ggml-q5_1.bin:
c52e0cd23878c3373a8a7f6adb484a00dbae11b1d6bbd84aa20e82378cbb4bfa
CertUtil: -hashfile command completed successfully.
```

And here is koboldcpp's output:
```
Welcome to KoboldCpp - Version 1.21.1
For command line arguments, please refer to --help
Otherwise, please manually select ggml file:
Attempting to use OpenBLAS library for faster prompt ingestion. A compatible libopenblas will be required.
Initializing dynamic library: koboldcpp_openblas.dll

Loading model: D:\program\koboldcpp\starcoder-ggml-q5_1.bin
[Threads: 12, BlasThreads: 12, SmartContext: True]


Identified as GPT-NEO-X model: (ver 401)
Attempting to Load...

System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
stablelm_model_load: loading model from 'D:\program\koboldcpp\starcoder-ggml-q5_1.bin' - please wait ...
stablelm_model_load: n_vocab = 49152
stablelm_model_load: n_ctx = 8192
stablelm_model_load: n_embd = 6144
stablelm_model_load: n_head = 48
stablelm_model_load: n_layer = 40
stablelm_model_load: n_rot = 1009
stablelm_model_load: ftype = 49152
GGML_ASSERT: ggml.c:3446: wtype != GGML_TYPE_COUNT
```


This is a known limitation: llama.cpp only knows how to load LLaMA-architecture models, and StarCoder is a different architecture with differently named tensors, which is why loading fails with "missing tok_embeddings.weight". There is an open issue about it in the llama.cpp repo - https://github.com/ggerganov/llama.cpp/issues/1441

For now there is only example code here - https://github.com/ggerganov/ggml/tree/master/examples/starcoder
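
If you want to try it, building and running looks roughly like this (the `starcoder` build target and the `-m`/`-p`/`-n`/`-t` flags come from the ggml examples' common argument parser; double-check against the repo in case they have changed):

```
git clone https://github.com/ggerganov/ggml
cd ggml && mkdir build && cd build
cmake .. && make -j4 starcoder
./bin/starcoder -m /path/to/starcoder-ggml-q5_1.bin -p "def fibonacci(n):" -n 64 -t 12
```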

This code works, but it is not very useful: it loads the model, generates a reply to a single prompt, and shuts down. I keep experimenting with the code to get a conversation loop going, but I'm having trouble with it - it looks like I haven't figured out how to manage memory correctly. It breaks after a single loop iteration with "not enough memory in context". We'll see if I can do better; a sketch of what I am aiming for is below.
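
For reference, here is a minimal sketch of the loop, assuming `starcoder_model_load` / `starcoder_eval` are pulled out of the example's main.cpp so they can be called directly, and that the `gpt_*` helpers from ggml's examples/common keep their current signatures. The two things that have to be managed are the scratch buffer (the example sizes it from a `mem_per_token` estimate made in a warm-up call) and the KV cache, which only holds `n_ctx` tokens, so `n_past` plus the new batch has to stay under `n_ctx`:

```
// Conversation-loop sketch on top of ggml's starcoder example.
// Assumes starcoder_model_load / starcoder_eval are factored out of the
// example's main.cpp, and the gpt_* helpers come from examples/common.
#include <cstdio>
#include <iostream>
#include <random>
#include <string>
#include <vector>

int main(int argc, char ** argv) {
    gpt_params params;
    if (!gpt_params_parse(argc, argv, params)) return 1;

    std::mt19937 rng(params.seed);

    gpt_vocab vocab;
    starcoder_model model;
    if (!starcoder_model_load(params.model, model, vocab)) return 1;

    int n_past = 0;           // number of tokens already in the KV cache
    size_t mem_per_token = 0;
    std::vector<float> logits;

    // Warm-up call, as in the example: estimates mem_per_token so that
    // later evals can size their scratch buffer correctly.
    starcoder_eval(model, params.n_threads, 0, {0, 1, 2, 3}, logits, mem_per_token);

    std::string line;
    for (;;) {
        printf("> ");
        fflush(stdout);
        if (!std::getline(std::cin, line)) break;

        std::vector<gpt_vocab::id> embd = gpt_tokenize(vocab, line + "\n");

        for (int i = 0; i < params.n_predict; ++i) {
            // The KV cache only holds n_ctx tokens; evaluating past that
            // limit is what exhausts the ggml context memory.
            if (n_past + (int) embd.size() >= model.hparams.n_ctx) {
                printf("\n[context full - dropping history]\n");
                n_past = 0; // crude reset; smarter trimming is possible
                break;
            }

            if (!starcoder_eval(model, params.n_threads, n_past, embd, logits, mem_per_token)) {
                fprintf(stderr, "eval failed\n");
                return 1;
            }
            n_past += embd.size(); // advance only by what was just evaluated

            const int n_vocab = model.hparams.n_vocab;
            gpt_vocab::id id = gpt_sample_top_k_top_p(
                vocab, logits.data() + (logits.size() - n_vocab),
                params.top_k, params.top_p, params.temp, rng);

            if (id == 0) break; // <|endoftext|>
            printf("%s", vocab.id_to_token[id].c_str());
            fflush(stdout);

            embd = { id }; // feed back only the newly sampled token
        }
        printf("\n");
    }
    return 0;
}
```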

Koboldcpp now supports StarCoder GGML models.
