How do you run this?

#2
by LaferriereJC - opened

I tried text-generation-webui; it detects the model as llama but fails to run it, yet I can run:

python server.py --model ggml-alpaca-7b-q4 --listen

I tried cformers on an M1 Mac, but its response is blank. If I type "hi" at the ">" prompt, nothing appears, only another ">":

> Hi


>

I tried to modify the interface as follows:
# stablelm
'cakewalk/ggml-q4_0-stablelm-tuned-alpha-7b': ModelUrlMap(
    cpp_model_name="gptneox",
    int4_fixed_zero="https://huggingface.co/cakewalk/ggml-q4_0-stablelm-tuned-alpha-7b/resolve/main/ggml-model-stablelm-tuned-alpha-7b-q4_0.bin"),

and in chat.py:
model_map = {'stablelm': 'cakewalk/ggml-q4_0-stablelm-tuned-alpha-7b', 'pythia': 'OpenAssistant/oasst-sft-1-pythia-12b', 'bloom': 'bigscience/bloom-7b1', 'gptj': 'Eleuther$

but when I attempt to run:
python chat.py -m stablelm

I get an error:
cakewalk/ggml-q4_0-stablelm-tuned-alpha-7b does not appear to have a file named config.json

However, none of the other models have a config.json either.
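That message is the standard Hugging Face error when a `from_pretrained`-style lookup is pointed at a repo that only hosts the .bin. One possible workaround is to redirect the config/tokenizer lookup to the original stabilityai repo, which does ship a config.json. This is only a sketch: `resolve_tokenizer_repo` and `FALLBACK_REPOS` are hypothetical names, and the assumption that cformers' lookup can be redirected this way is untested.

```python
# Hypothetical helper: decide which repo to load the config/tokenizer from.
# Assumption: the GGML repo hosts only the quantized .bin, so config.json
# and tokenizer files must come from the original stabilityai repo instead.
FALLBACK_REPOS = {
    'cakewalk/ggml-q4_0-stablelm-tuned-alpha-7b':
        'stabilityai/stablelm-tuned-alpha-7b',
}

def resolve_tokenizer_repo(model_id, repo_files):
    """Return model_id if its repo ships a config.json, else a fallback repo."""
    if 'config.json' in repo_files:
        return model_id
    return FALLBACK_REPOS.get(model_id, model_id)
```

With this, pointing the tokenizer at `stabilityai/stablelm-tuned-alpha-7b` while still downloading the GGML .bin from the cakewalk repo might get past the config.json error.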

Do you have instructions for how to set this up?

I assumed gptneox from looking at the config.json files for the StableLM models (e.g. https://huggingface.co/stabilityai/stablelm-tuned-alpha-3b/blob/main/config.json).

There is a Windows fork that asks which file to select on startup and supports GGML models: https://github.com/LostRuins/koboldcpp
Pardon: this model does not start there either.

I see the cformers link you provided has been updated to include an option for this model.

My concern is that it's not going to respect the stop tokens identified in
https://huggingface.co/vvsotnikov/stablelm-tuned-alpha-3b-16bit

As a result, I'm looking at the above model with DeepSpeed and have hard-coded the class into text-generation-webui's load_preset_values.

The code worked with cformers, and it was fast, but it was generating a lot of run-on sentences.
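The run-on output is consistent with the stop tokens being ignored. As a stopgap, the generated ids could be truncated at the first stop token after the fact. The ids below follow the StoppingCriteria suggested in the StableLM tuned-alpha model card (<|USER|>, <|ASSISTANT|>, <|SYSTEM|>, <|padding|>, <|endoftext|>); treat them as an assumption and verify against the tokenizer you actually load.

```python
# Stop-token ids assumed from the StableLM tuned-alpha model card;
# verify against your tokenizer before relying on them.
STOP_IDS = {50278, 50279, 50277, 1, 0}

def truncate_at_stop(token_ids, stop_ids=STOP_IDS):
    """Cut a generated sequence at the first stop token (exclusive)."""
    for i, tok in enumerate(token_ids):
        if tok in stop_ids:
            return token_ids[:i]
    return token_ids
```

This does not fix generation itself (the model still wastes time producing tokens past the stop point), but it would clean up the run-on text in the meantime.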

Will this work with https://github.com/ggerganov/ggml/tree/master/examples/stablelm ? It seems like this model has incorrect n_ctx and n_embd sizes:

main: seed = 1682298523
stablelm_model_load: loading model from '../../models/cakewalk__ggml-q4_0-stablelm-tuned-alpha-7b/ggml-model-stablelm-tuned-alpha-7b-q4_0.bin' - please wait ...
stablelm_model_load: n_vocab = 50432
stablelm_model_load: n_ctx = 6144
stablelm_model_load: n_embd = 48
stablelm_model_load: n_head = 16
stablelm_model_load: n_layer = 32
stablelm_model_load: n_rot = 1
stablelm_model_load: ftype = 2
stablelm_model_load: ggml ctx size = 75.89 MB
stablelm_model_load: memory_size = 36.00 MB, n_mem = 196608
stablelm_model_load: tensor 'gpt_neox.embed_in.weight' has wrong size in model file
main: failed to load model from '../../models/cakewalk__ggml-q4_0-stablelm-tuned-alpha-7b/ggml-model-stablelm-tuned-alpha-7b-q4_0.bin'
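The values in that log look shifted rather than random: 6144 is the 7B model's hidden size, 48 its head count, and 16 its layer count, which suggests the file's header layout is one field out of sync with what this loader expects. A quick way to check is to parse the header directly. The field order below is assumed from the loader's log output (magic, then n_vocab, n_ctx, n_embd, n_head, n_layer, n_rot, ftype, each a 32-bit little-endian int); verify it against the converter that wrote the file.

```python
import struct

# Assumed header layout, inferred from the loader's log output above.
FIELDS = ('n_vocab', 'n_ctx', 'n_embd', 'n_head', 'n_layer', 'n_rot', 'ftype')

def parse_ggml_header(data):
    """Parse the leading 32 bytes of a GGML StableLM file: magic + 7 int32s."""
    magic, = struct.unpack_from('<i', data, 0)
    values = struct.unpack_from('<7i', data, 4)
    return magic, dict(zip(FIELDS, values))
```

Running this over the first 32 bytes of the cakewalk .bin and comparing against the expected 7B values (n_embd = 6144, n_head = 48, n_layer = 16) should show whether the file simply lacks an n_ctx field, or was written by a converter with a different header layout.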
