
Help!: I Can't Convert RedPajama-INCITE-Chat-7B-v0.1 to ggml

#6
by Joseph717171 - opened

I have tried to convert RedPajama-INCITE-Chat-7B-v0.1 to ggml using your packaged convert-pth-to-ggml.py and convert.py, to no avail. Any help you can give would be greatly appreciated.

Joseph717171 changed discussion title from Help!: I Can't Convert RedPajama-INCITE-Chat-7B-v0.1 to sgml to Help!: I Can't Convert RedPajama-INCITE-Chat-7B-v0.1 to ggml
Together org

@biyuan can you help here?

Together org

Hi @Joseph717171, would you like to try the following steps?

  1. Check out our code at https://github.com/togethercomputer/redpajama.cpp.git;
  2. Run make redpajama-chat quantize-gptneox to build the chat and quantization binaries;
  3. Create a new script like the following under /examples/redpajama/scripts/ (just a slight change of the current install script; see the sketch after this list):
    [screenshot of the modified install script, 2023-05-18]
  4. Run this script with bash and you should be good to go. (I tested this on my machine.)
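Since the screenshot itself isn't visible here, below is a minimal sketch of what that step-3 script might look like, modeled on the existing 3B install scripts under examples/redpajama/scripts/. The script names (convert_gptneox_to_ggml.py, quantize-gptneox), directory layout, and output filenames are my assumptions; check the repo for the exact ones before running.

```bash
#!/bin/bash
# Hypothetical install/convert script for the 7B chat model, modeled on the
# 3B install scripts in examples/redpajama/scripts/. Script names, arguments,
# and output filenames are assumptions -- verify against the repo.
set -e

MODEL_NAME="RedPajama-INCITE-Chat-7B-v0.1"
MODEL_DIR="./examples/redpajama/models/pythia"   # assumed download/output location
mkdir -p "${MODEL_DIR}"

# 1. Download the Hugging Face checkpoint (requires git-lfs).
git lfs install
git clone "https://huggingface.co/togethercomputer/${MODEL_NAME}" "${MODEL_DIR}/${MODEL_NAME}"

# 2. Convert the GPT-NeoX checkpoint to a ggml fp16 file.
#    (Needs enough free RAM to hold the full fp16 model, roughly 14 GB.)
python3 ./examples/redpajama/scripts/convert_gptneox_to_ggml.py \
    "${MODEL_DIR}/${MODEL_NAME}" "${MODEL_DIR}"

# 3. Quantize to 4 bits so inference fits in much less RAM (optional).
#    The exact argument form of quantize-gptneox may differ; see its usage/help.
./quantize-gptneox \
    "${MODEL_DIR}/ggml-${MODEL_NAME}-f16.bin" \
    "${MODEL_DIR}/ggml-${MODEL_NAME}-q4_0.bin" q4_0
```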

On the other hand, there is some risk that this procedure fails on a machine with limited resources: if the CPU RAM cannot hold the 7B model (about 14 GB), the script will exit. We do not currently have efficient support for running the conversion under a restricted memory budget.
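For a rough sense of that number (my own back-of-the-envelope, not stated above): ~7B parameters × 2 bytes per fp16 weight ≈ 14 GB, which is where the figure comes from. The q4_0-quantized output is roughly a quarter of that size, but the conversion step itself still has to hold the full fp16 model in RAM.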

Thanks for your help guys! It worked like a charm! 🤩🚀

Hi, what inference speed are you getting for this? Is this supported in any way with LangChain?
