Can this model be used with langchain llamacpp ? If so would you be kind enough to provide code. Thanks
Yeah - install llama-cpp-python, then here's a quick example:
from llama_cpp import Llama
import random

# Load the GGML model; n_gpu_layers offloads layers to the GPU and the
# random seed makes each run sample differently.
llm = Llama(
    model_path="/path/to/stable-vicuna-13B.ggmlv3.q5_1.bin",
    n_gpu_layers=40,
    seed=random.randint(1, 2**31),
)

tokens = llm.tokenize(b"### Human: Write a story about llamas\n### Assistant:")

# Stream tokens one at a time, stopping after 500 tokens or at end-of-sequence.
output = b""
count = 0
for token in llm.generate(tokens, top_k=40, top_p=0.95, temp=0.72, repeat_penalty=1.1):
    text = llm.detokenize([token])
    print(text.decode(), end='', flush=True)
    output += text
    count += 1
    if count >= 500 or token == llm.token_eos():
        break

print("Full response:", output.decode())
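And since the question asked about LangChain specifically, here is a minimal sketch using LangChain's LlamaCpp wrapper (assuming a 2023-era langchain install; the path is the same placeholder as above, and n_ctx=2048 is my assumption):

from langchain.llms import LlamaCpp

# Sketch: LangChain's wrapper around the same GGML file.
# Parameter names mirror llama-cpp-python; adjust the path to your setup.
llm = LlamaCpp(
    model_path="/path/to/stable-vicuna-13B.ggmlv3.q5_1.bin",
    n_gpu_layers=40,
    n_ctx=2048,
    temperature=0.72,
    top_p=0.95,
    top_k=40,
    repeat_penalty=1.1,
    max_tokens=500,
)

print(llm("### Human: Write a story about llamas\n### Assistant:"))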
Thanks for the code, but I'm getting an assertion error. I'm using llama-cpp-python == 0.1.52 with the ggmlv3.q5_1 bin file:
assert self.ctx is not None
Would you know if this bin file is compatible with that package version? Thank you for your help.
I had that same issue, and had to use the ggmlv2 version. I think you have to build the newer llama.cpp for the ggmlv3, but I could be wrong.
llama-cpp-python was updated to support GGMLv3 about 10 hours ago; version 0.1.53 supports it.
You can install llama-cpp-python 0.1.53 on Windows without compiling with:
pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.1.53/llama_cpp_python-0.1.53-cp310-cp310-win_amd64.whl
Or, alternatively, use ctransformers, which can be installed with:

pip install ctransformers
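For completeness, a minimal sketch of loading the same file with ctransformers (the local path is the same placeholder; model_type="llama" because StableVicuna is a LLaMA-family model):

from ctransformers import AutoModelForCausalLM

# Sketch: point ctransformers at the local GGML file and tell it the
# architecture family; adjust the path to wherever your bin file lives.
llm = AutoModelForCausalLM.from_pretrained(
    "/path/to/stable-vicuna-13B.ggmlv3.q5_1.bin",
    model_type="llama",
)

print(llm("### Human: Write a story about llamas\n### Assistant:", max_new_tokens=500))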