---
license: gemma
library_name: transformers
base_model: google/gemma-1.1-7b-it
---
|

## Usage (llama-cli with GPU):

```shell
llama-cli -m ./gemma-1.1-7b-it-Q6_K.gguf -ngl 100 --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```
|

## Usage (llama-cli with CPU):

```shell
llama-cli -m ./gemma-1.1-7b-it-Q6_K.gguf --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```
|

## Usage (llama-cpp-python via Hugging Face Hub):

```python
from llama_cpp import Llama

# Download the GGUF from the Hub and load it with GPU offload enabled.
llm = Llama.from_pretrained(
    repo_id="chenghenry/gemma-1.1-7b-it-GGUF",
    filename="gemma-1.1-7b-it-Q6_K.gguf",
    n_ctx=8192,
    n_batch=2048,
    n_gpu_layers=100,
    verbose=False,
    chat_format="gemma",
)

prompt = "Why is the sky blue?"

messages = [{"role": "user", "content": prompt}]
response = llm.create_chat_completion(
    messages=messages,
    repeat_penalty=1.0,
    temperature=0,
)

print(response["choices"][0]["message"]["content"])
```
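
With `chat_format="gemma"`, llama-cpp-python renders the message list into Gemma's turn-based prompt template before inference. The sketch below illustrates that template with a hypothetical helper (`format_gemma_prompt` is not part of the library; the library's internal implementation may differ in details):

```python
def format_gemma_prompt(messages):
    # Hypothetical helper showing the shape of the Gemma chat template:
    # each turn is wrapped in <start_of_turn>/<end_of_turn> markers, and the
    # prompt ends with an open "model" turn for the assistant to complete.
    parts = []
    for m in messages:
        # Gemma uses the role name "model" for assistant turns.
        role = "model" if m["role"] == "assistant" else "user"
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

print(format_gemma_prompt([{"role": "user", "content": "Why is the sky blue?"}]))
# <start_of_turn>user
# Why is the sky blue?<end_of_turn>
# <start_of_turn>model
```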