Notebook to test Llama 2 in Colab free tier

#3
by r3gm - opened

@r3gm , Hi, can you also show an example of running the Llama 2 13B models on CPU?
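Not an official answer, but a minimal CPU-only sketch with llama-cpp-python; the 13B file path and thread count here are assumptions:

from llama_cpp import Llama

# load a quantized 13B GGML file from local disk (path assumed)
llm = Llama(
    model_path="./models/13B/llama-2-13b-chat.ggmlv3.q4_K_M.bin",
    n_ctx=2048,    # context window
    n_threads=8,   # set to your physical core count
)
out = llm("Q: Name the planets in the solar system. A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])

On CPU the 13B q4 quantizations need roughly 10 GB of RAM and generate slowly, so the 7B files are usually a better fit for the Colab free tier.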

@r3gm , any pointers on how to compile for Metal and run locally on an M2? Thanks.

Follow this guide: https://llama-cpp-python.readthedocs.io/en/latest/install/macos/
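In short, the guide amounts to reinstalling llama-cpp-python with the Metal build flag enabled; these are the flags as of the mid-2023 docs, so check the linked page for the current ones:

CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir

Then pass n_gpu_layers=1 when constructing the model so the layers are offloaded to Metal.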

@r3gm or @kroonen , I stayed with GGML v3 and the 4.0 quantization as recommended, but I get an Illegal Instruction: 4. Any suggestions?

(llama2-metal) R77NK6JXG7:llama2 venuvasudevan$ pip list|grep llama
llama-cpp-python 0.1.74

[Screenshot 2023-07-23 at 10.19.30 AM.png]

Is there some way we could increase the context length? Currently it's 512 tokens. @r3gm

Yes, you pass the n_ctx arg, like so:

llm = LlamaLLM(model_path="./models/7B/llama-2-7b-chat.ggmlv3.q4_K_M.bin", n_ctx=2048)
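(LlamaLLM looks like the notebook's own wrapper; the underlying llama_cpp.Llama class takes the same keyword. Note that n_ctx must be set at load time, and Llama 2 was trained with a 4096-token context, so values up to 4096 are safe at the cost of more RAM:)

from llama_cpp import Llama
llm = Llama(model_path="./models/7B/llama-2-7b-chat.ggmlv3.q4_K_M.bin", n_ctx=4096)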

Hey, I tried running this again today and for some reason it doesn't work anymore: when setting lcpp_llm properties it throws an AttributeError. I have tried running it in Colab, Kaggle, and on 3 different machines; it just doesn't seem to work.
[Screenshot 2023-09-16 142418.png]
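A guess, since the screenshot isn't legible here: llama-cpp-python 0.1.79 and later dropped GGML support in favor of GGUF, so a notebook written for .ggmlv3.bin files breaks on a fresh install. If that is the cause, pinning the last GGML-compatible release may help:

pip install llama-cpp-python==0.1.78  # last release that can load ggmlv3 models

Otherwise, re-download the model in GGUF format and keep the latest release.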
