How to run this model?
+1. It would be great to have a step-by-step guide for text-generation-webui (or any other similar tool) with this GGML model.
It doesn't work in text-generation-webui at this time. Hopefully it will in future.
But it will work in GPT4All-UI, using the ctransformers backend. Please see the GPT4All-UI repo.
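Since the ctransformers backend mentioned above supports MPT GGML files directly, a minimal standalone sketch (the model path here is an assumption — point it at wherever you downloaded the `.bin` file):

```python
# Sketch of loading an MPT GGML model with ctransformers directly.
# MODEL_PATH is a placeholder — adjust it to your local download.
MODEL_PATH = "mpt-7b-storywriter.ggmlv3.q5_1.bin"
PROMPT = "Write a story about llamas"

def generate(model_path: str = MODEL_PATH, prompt: str = PROMPT) -> str:
    # Import inside the function so the sketch can be read without
    # ctransformers installed; `pip install ctransformers` to run it.
    from ctransformers import AutoModelForCausalLM

    # model_type='mpt' tells ctransformers which GGML architecture to use.
    llm = AutoModelForCausalLM.from_pretrained(model_path, model_type="mpt")
    return llm(prompt, max_new_tokens=256, temperature=0.8)

if __name__ == "__main__":
    print(generate())
```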
For now, you can use this PR https://github.com/abetlen/llama-cpp-python/pull/251
```shell
cd text-generation-webui
conda activate textgen
pip uninstall llama-cpp-python
pip install git+https://github.com/gdedrouas/llama-cpp-python
```
> For now, you can use this PR https://github.com/abetlen/llama-cpp-python/pull/251
No, that's not related to this. That PR is for supporting the latest quantisation methods for Llama models; it does not add MPT model support.
```shell
git clone https://github.com/ggerganov/ggml
cd ggml
mkdir build
cd build
cmake ..
cmake --build . --config Release
bin/mpt -m /path/to/mpt-7b-storywriter.ggmlv3.q5_1.bin -t 8 -n 512 -p "Write a story about llamas"
```
This works for me.
Run it in LangChain using CTransformers:

```python
from ctransformers.langchain import CTransformers
from langchain import LLMChain
from langchain.prompts import PromptTemplate

config = {'temperature': 0.6, 'max_new_tokens': 6000, 'stream': True}
llm = CTransformers(
    model='/mnt/HC_Volume_32217551/models/MPT-7B-Storywriter-GGML/mpt-7b-storywriter.ggmlv3.q5_0.bin',
    model_type='mpt',
    config=config,
)
```
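The `LLMChain` and `PromptTemplate` imports above suggest wiring the LLM into a chain; a minimal sketch (the template text and topic are illustrative assumptions, not from the original post):

```python
# Sketch of driving a CTransformers-backed MPT model through an LLMChain.
# TEMPLATE and the default topic are illustrative, not from the original post.
TEMPLATE = "Write a story about {topic}"

def run_chain(model_path: str, topic: str = "llamas") -> str:
    # Imports kept inside the function so the sketch reads without
    # langchain/ctransformers installed.
    from ctransformers.langchain import CTransformers
    from langchain import LLMChain
    from langchain.prompts import PromptTemplate

    llm = CTransformers(
        model=model_path,
        model_type="mpt",
        config={"temperature": 0.6, "max_new_tokens": 6000},
    )
    prompt = PromptTemplate(template=TEMPLATE, input_variables=["topic"])
    chain = LLMChain(llm=llm, prompt=prompt)
    return chain.run(topic)
```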