---
license: bsd
datasets:
- ManthanKulakarni/Text2JQL_v2
language:
- en
pipeline_tag: text-generation
tags:
- LLaMa
- JQL
- Jira
- GGML
- GGML-q8_0
- GPU
- CPU
- 7B
- llama.cpp
- text-generation-webui
---

GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/ggerganov/llama.cpp).

## How to run in `llama.cpp`

```
./main -t 10 -ngl 32 -m ggml-model-q8_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write JQL(Jira query Language) for give input ### Input: stories assigned to manthan which are created in last 10 days with highest priority and label is set to release ### Response:"
```

Change `-t 10` to the number of physical CPU cores you have. For example, if your system has 8 cores/16 threads, use `-t 8`.

Change `-ngl 32` to the number of layers to offload to the GPU. Remove it if you don't have GPU acceleration.

To have a chat-style conversation, replace the `-p` argument with `-i -ins`.

## How to run in `text-generation-webui`

Further instructions here: [text-generation-webui/docs/llama.cpp-models.md](https://github.com/oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md).

## How to run using `LangChain`

##### Installation on CPU

```
pip install llama-cpp-python
```

##### Installation on GPU

```
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
```

```python
from langchain.llms import LlamaCpp
from langchain import PromptTemplate, LLMChain
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

n_gpu_layers = 40  # Change this value based on your model and your GPU VRAM pool.
n_batch = 512  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
n_ctx = 2048  # Context window size in tokens.

# Stream generated tokens to stdout as they are produced.
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

# Make sure the model path is correct for your system!
llm = LlamaCpp(
    model_path="./ggml-model-q8_0.bin",
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    callback_manager=callback_manager,
    verbose=True,
    n_ctx=n_ctx,
)

llm("""### Instruction: Write JQL(Jira query Language) for give input ### Input: stories assigned to manthan which are created in last 10 days with highest priority and label is set to release ### Response:""")
```

For more information, refer to the [LangChain documentation](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/llamacpp).
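
Both examples above send the same instruction-style prompt and change only the text after `### Input:`. A small helper can keep that template in one place. This is a minimal sketch, not part of the original model card: `build_prompt` and `INSTRUCTION` are hypothetical names, and it assumes the sections are separated by single spaces, as they appear in the examples. If the model was fine-tuned with line breaks between sections, adjust the template to match.

```python
# Hypothetical helper: wraps a plain-English request in the prompt template
# used in the examples above.

# Instruction text copied verbatim from the examples; the wording is left
# unchanged on the assumption that it matches the fine-tuning data.
INSTRUCTION = "Write JQL(Jira query Language) for give input"


def build_prompt(request: str) -> str:
    """Return the full prompt for one natural-language request."""
    # Assumption: sections are joined by single spaces, as shown above.
    return f"### Instruction: {INSTRUCTION} ### Input: {request.strip()} ### Response:"


prompt = build_prompt("bugs reported by manthan that are still open")
print(prompt)
```

The resulting string can be passed to `llm(...)` in the LangChain example or to the `-p` argument of `./main`.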