license: gpl-3.0
metrics:
- perplexity
pipeline_tag: conversational
tags:
- LLaMa
- text-generation-inference
- ggml
LLaMa 65B converted to ggml via LLaMa.cpp, then quantized to 4-bit (q4_0).
As a good starting point, I recommend running with the following settings: `main.exe -m ggml-LLaMa-65B-q4_0.bin -n -1 -t 42 -c 2048 --temp 0.35 --interactive-first --repeat_penalty 1.2 --instruct --color`. The `-t` value is the number of CPU threads, so adjust it to match your own machine.
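
To make the recommended settings easier to adapt, here is a minimal Python sketch that assembles the same command as an argument list. The helper name `build_llama_args` and its parameters are hypothetical, introduced only for illustration; the flags and values are the ones from the command above. Falling back to `os.cpu_count()` when no thread count is given is an assumption of this sketch, not part of the original recommendation.

```python
import os


def build_llama_args(model_path, threads=None, executable="main.exe"):
    """Build the argument list for the recommended llama.cpp invocation.

    Hypothetical helper: the flag values mirror the recommended command;
    only the thread count and paths are meant to be changed.
    """
    if threads is None:
        # Assumption of this sketch: fall back to the logical CPU count.
        threads = os.cpu_count() or 1
    return [
        executable,
        "-m", model_path,          # path to the quantized ggml model file
        "-n", "-1",                # tokens to generate; -1 means no fixed limit
        "-t", str(threads),        # number of CPU threads
        "-c", "2048",              # context size in tokens
        "--temp", "0.35",          # sampling temperature
        "--interactive-first",     # wait for user input before generating
        "--repeat_penalty", "1.2", # penalize repeated token sequences
        "--instruct",              # instruction mode
        "--color",                 # colorize output
    ]


# Reproduce the recommended command with the original thread count.
args = build_llama_args("ggml-LLaMa-65B-q4_0.bin", threads=42)
print(" ".join(args))
```

The resulting list can be passed to `subprocess.run(args)` from the directory that contains the executable and the model file.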