
Python library
#2 opened by marella

Hi, thanks for uploading the models.

I created Python bindings for GGML models: https://github.com/marella/ctransformers. It currently supports MPT, LLaMA, and many other models (see Supported Models).

It provides a unified interface for all models, supports LangChain and can be used with Hugging Face Hub models:

from ctransformers import AutoModelForCausalLM

# Downloads the quantized model file from the Hugging Face Hub and loads it.
llm = AutoModelForCausalLM.from_pretrained('TheBloke/MPT-7B-GGML', model_type='mpt', model_file='mpt-7b.ggmlv3.q4_0.bin')

print(llm('AI is going to'))
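The LangChain side of that looks roughly like this. This is a minimal sketch, assuming LangChain's CTransformers wrapper; check the LangChain docs for the exact import path and parameters:

from langchain.llms import CTransformers

# Assumed wrapper; the arguments mirror from_pretrained() above.
llm = CTransformers(model='TheBloke/MPT-7B-GGML', model_type='mpt', model_file='mpt-7b.ggmlv3.q4_0.bin')

print(llm('AI is going to'))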

If you add a config.json file to your model repo with { "model_type": "mpt" }, then model_type can be omitted:

llm = AutoModelForCausalLM.from_pretrained('TheBloke/MPT-7B-GGML', model_file='mpt-7b.ggmlv3.q4_0.bin')
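For reference, that config.json would contain just the model type:

{
  "model_type": "mpt"
}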

If there is only one model file in the repo, then model_file can also be omitted:

llm = AutoModelForCausalLM.from_pretrained('TheBloke/MPT-7B-GGML')

Please see marella/gpt-2-ggml for reference.

Oh wonderful, this looks really great @marella. I will update the README to point people to this tool.

Does it support CUDA offloading?

Currently it doesn't support CUDA, but I'm planning to add it in the future after adding a few more features and models.

OK thanks. I know I'm going to get a lot of people asking about that. Like this guy: https://huggingface.co/TheBloke/MPT-7B-GGML/discussions/1#6469176ab2321e47d3288721

I've added your library and GPT4All-UI to the READMEs for my three MPT models.

