cekal/mpt-7b-peft-compatible

cekal committed on May 27, 2023

Commit c1cdc06 (1 parent: a677387)

Update README.md

Files changed (1)

README.md +3 -2

@@ -18,7 +18,7 @@ This is MPT-7B patched so that it can be used with a LoRA. Note that while I tes
 Note that when using LoRA, there is a strange quirk that prevents me from causing generation with an empty prompt.
-I also included a model-agnostic export_hf_checkpoint.py script, which you can use to merge your lora back into a new full model. Once you do this, you do not need to use the patched version of the model code anymore. That being said, if you want to be able to load the model in 8bit you will still need it. The usage is `python export_hf_checkpoint.py <source> <lora> <dest>`.
 If you would like to use this with text-generation-webui, apply the following patch:
 ```
@@ -45,7 +45,7 @@ If you would like to use this with text-generation-webui, apply the following pa
 ```
 python server.py --model mosaicml_mpt-7b-instruct --trust-remote-code --load-in-8bit
 ```
-You may also need to patch bitsandbytes/nn/modules.py to prevent running out of VRAM when saving the LoRA:
 ```
 --- a/modules.py
 +++ b/modules.py
@@ -69,3 +69,4 @@ You may also need to patch bitsandbytes/nn/modules.py to prevent running out of
 The alterations are based on the source code for the llama model from HF Transformers.

 Note that when using LoRA, there is a strange quirk that prevents me from causing generation with an empty prompt.
+I also included a model-agnostic `export_hf_checkpoint.py` script, which you can use to merge your lora back into a new full model (check Github link at the end). Once you do this, you do not need to use the patched version of the model code anymore. That being said, if you want to be able to load the model in 8bit you will still need it. The usage is `python export_hf_checkpoint.py <source> <lora> <dest>`.
 If you would like to use this with text-generation-webui, apply the following patch:
 ```
 ```
 python server.py --model mosaicml_mpt-7b-instruct --trust-remote-code --load-in-8bit
 ```
+You may also need to patch `bitsandbytes/nn/modules.py` to prevent running out of VRAM when saving the LoRA:
 ```
 --- a/modules.py
 +++ b/modules.py
 The alterations are based on the source code for the llama model from HF Transformers.
+Big thanks to "iwalton3" for making this possible. You can find the `export_hf_checkpoint.py` here: https://github.com/iwalton3/mpt-lora-patch/blob/master/export_hf_checkpoint.py
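
The README says the export script merges a LoRA back into a full model, but this diff does not show how the script does it. As a rough illustration only, the sketch below shows the usual LoRA merge arithmetic for a single weight matrix: the low-rank update `lora_b @ lora_a`, scaled by `alpha / rank`, is added to the base weight. The function names, the tiny matrices, and the `alpha`/`rank` values are hypothetical and are not taken from `export_hf_checkpoint.py`.

```python
# Illustrative sketch of merging one LoRA adapter into one weight matrix.
# Hypothetical names and values; not the code from export_hf_checkpoint.py.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) as a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape (out_features, in_features)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# A 2x2 base weight with a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]     # shape (rank, in_features)
lora_b = [[0.5], [0.25]]  # shape (out_features, rank)

merged = merge_lora(weight, lora_a, lora_b, alpha=2, rank=1)
print(merged)
```

After a merge like this, the adapter matrices are no longer needed at inference time, which matches the README's note that the merged model can be used without the patched model code.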