---
library_name: transformers
license: llama2
pipeline_tag: text-generation
tags:
- GGUF
- llama-2
- llama
- meta
- facebook
- quantized
- 7b
---

# Model Card for alokabhishek/Llama-2-7b-chat-hf-GGUF

This repo contains a GGUF quantized version of Meta's [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) model, quantized with llama.cpp.

## Model Details

- Model creator: [Meta](https://huggingface.co/meta-llama)
- Original model: [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)

### About GGUF quantization using llama.cpp

- llama.cpp GitHub repo: [ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp)

## How to Get Started with the Model

Use the code below to get started with the model.

### How to run from Python code

#### First install the packages

```shell
# Base ctransformers with CUDA GPU acceleration
# (quote the requirement so the shell does not treat ">=" as a redirect)
! pip install "ctransformers[cuda]>=0.2.24"

# Or with no GPU acceleration
# ! pip install "ctransformers>=0.2.24"

! pip install transformers huggingface_hub torch
```

#### Import

```python
from ctransformers import AutoModelForCausalLM
from transformers import pipeline, AutoTokenizer
```

#### Use a pipeline as a high-level helper

```python
# Load the quantized model; with hf=True, ctransformers returns a
# transformers-compatible model that can be used with transformers pipelines.
model_llama = AutoModelForCausalLM.from_pretrained(
    "alokabhishek/Llama-2-7b-chat-hf-GGUF",
    model_file="llama-2-7b-chat-hf.Q4_K_M.gguf",  # replace Q4_K_M.gguf with Q5_K_M.gguf as needed
    model_type="llama",
    gpu_layers=50,  # number of layers to offload to the GPU
    hf=True,
)

# Load the tokenizer from the same repo
tokenizer_llama = AutoTokenizer.from_pretrained(
    "alokabhishek/Llama-2-7b-chat-hf-GGUF",
    use_fast=True,
)

# Create a text-generation pipeline
pipe_llama = pipeline(task="text-generation", model=model_llama, tokenizer=tokenizer_llama)

prompt_llama = "Tell me a funny joke about Large Language Models meeting a black hole in an intergalactic bar."

output_llama = pipe_llama(prompt_llama, max_new_tokens=512)
print(output_llama[0]["generated_text"])
```

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]
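
## Run with llama-cpp-python [optional]

Because the files in this repo are standard GGUF quantizations, they can also be run with llama-cpp-python instead of ctransformers. The sketch below is illustrative and not part of the original card: the repo id and Q4_K_M filename match the example above, while the `n_gpu_layers` and `n_ctx` values are assumptions to tune for your hardware. It assumes `pip install llama-cpp-python huggingface_hub`.

```python
# Minimal sketch: download the GGUF file and run it with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the quantized weights from this repo (pick another quant as needed).
gguf_path = hf_hub_download(
    repo_id="alokabhishek/Llama-2-7b-chat-hf-GGUF",
    filename="llama-2-7b-chat-hf.Q4_K_M.gguf",
)

# Load the model; n_gpu_layers=-1 offloads all layers to the GPU if one is
# available, and n_ctx=4096 is an assumed context size, not a repo setting.
llm = Llama(model_path=gguf_path, n_gpu_layers=-1, n_ctx=4096)

output = llm(
    "Tell me a funny joke about Large Language Models meeting a black hole in an intergalactic bar.",
    max_tokens=512,
)
print(output["choices"][0]["text"])
```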