
llamacpp command for it is correct?

#1
by mirek190 - opened

Hi Bloke,

Is the llama.cpp command correct for this model?

./main -t 10 -ngl 32 -m openassistant-llama2-13b-orca-8k-3319.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"

This model is 8k, so should it be -c 8192, or at least -c 4096 like for Llama 2 models?

It should be -c 4096 as a base, and then if you want 8192 you need to use the RoPE parameters, which I think are:
-c <contextsize> --rope-freq-base 10000 --rope-freq-scale 0.5
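For anyone wondering where the 0.5 comes from: with linear RoPE scaling, the scale factor is simply the trained context length divided by the context length you want to run at. A small sketch of that arithmetic (the helper name is just for illustration):

```python
def rope_freq_scale(trained_ctx: int, target_ctx: int) -> float:
    """Linear RoPE scaling factor: positions are compressed by this ratio
    so a model trained at trained_ctx can attend over target_ctx tokens."""
    return trained_ctx / target_ctx

# Llama 2's base context is 4096; to run this model at its full 8192:
print(rope_freq_scale(4096, 8192))  # 0.5
```

So for the full 8k context the flags would be `-c 8192 --rope-freq-base 10000 --rope-freq-scale 0.5`.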

I will update the README shortly

Also, that prompt template is wrong. My README creation code doesn't handle putting the right prompt template into the llama.cpp example command yet; I need to fix that.

mirek190 changed discussion title from Command for it is correct? to llamacpp command for it is correct?
