TheBloke/mpt-30B-instruct-GGML

text-generation-inference


TheBloke committed on Jun 22, 2023

Commit bb2e6f5 • 1 Parent(s): 235e5e6

Update README.md

Files changed (1)

README.md +6 -0


@@ -39,6 +39,12 @@ Please note that these GGMLs are **not compatible with llama.cpp, or currently w
 [KoboldCpp](https://github.com/LostRuins/koboldcpp) just added GPU accelerated (OpenCL) support for MPT models, so that is the client I recommend using for these models.
+**Note**: There is currently a bug with loading this model in KoboldCpp Release 1.32: it will wrongly detect it as a GPT-NeoX model.
+To resolve this, add argument `--forceversion 500`
+This should be fixed in the next release of KoboldCpp, so if you are running a version later than 1.32 it should not be necessary.
 ## Repositories available
 * [2, 3, 4, 5, 6 and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/mpt-30B-instruct-GGML)
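
As a rough illustration of the workaround in the added note, the launch command for KoboldCpp 1.32 might look like the sketch below. The model filename is hypothetical, and the sketch assumes `koboldcpp.py` is in the current directory and that the model is passed with `--model`; only `--forceversion 500` comes from the note itself. Later releases should not need the flag.

```shell
#!/bin/sh
# Hypothetical filename: replace with the GGML file you actually downloaded.
MODEL="mpt-30b-instruct.q4_0.bin"

# --forceversion 500 is the workaround from the note (KoboldCpp 1.32 only).
# Assumption: koboldcpp.py is in the current directory and accepts --model.
CMD="python koboldcpp.py --model $MODEL --forceversion 500"

# Print the command for review before running it.
echo "$CMD"
```

Once the printed command looks right for your setup, run it directly in the KoboldCpp directory.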