
Llama 2 7B 4-bit Python Coder (GPTQ) 👩‍💻

Llama 2 7B fine-tuned on the python_code_instructions_18k_alpaca code-instructions dataset using the QLoRA method in 4-bit with the PEFT library.
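The card does not include the training script; purely as an illustration, a minimal QLoRA setup with PEFT could look like the sketch below. The base checkpoint ID, LoRA rank, dropout, and target modules are assumptions, not values reported for this model.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization for QLoRA (assumed settings, not this card's exact recipe).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Attach LoRA adapters to the attention projections (hypothetical hyperparameters).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable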

Pretrained model description

Llama-2

Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters.

Model Architecture: Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align to human preferences for helpfulness and safety.

Training data

python_code_instructions_18k_alpaca

The dataset contains problem descriptions and their solutions in Python. It is derived from sahil2801/code_instructions_120k, with a prompt column added in the alpaca style.
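For reference, the dataset can be inspected with the Hugging Face datasets library; a minimal sketch, assuming the Hub ID iamtarun/python_code_instructions_18k_alpaca and a train split:

from datasets import load_dataset

# Load the instruction dataset (Hub ID assumed, see above).
dataset = load_dataset("iamtarun/python_code_instructions_18k_alpaca", split="train")
print(dataset.column_names)   # expect instruction/input/output plus the alpaca-style prompt
print(dataset[0]["prompt"])   # one fully formatted training prompt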

Framework versions

  • PEFT 0.4.0

Example of usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load the quantized model and tokenizer.
tokenizer = AutoTokenizer.from_pretrained("NurtureAI/llama-2-7b-int4-gptq-python")
model = AutoModelForCausalLM.from_pretrained(
  "NurtureAI/llama-2-7b-int4-gptq-python",
  torch_dtype=torch.float16,
  device_map="auto",  # the GPTQ quantization config is read from the repo
)
# Prepare the prompt (alpaca-style template used at fine-tuning time).
instruction = "Write a Python function to display the first and last elements of a list."
prompt = f"""### Instruction:
Use the Task below and the Input given to write the Response, which is a programming code that can solve the Task.

### Task:
{instruction}

### Input:


### Response:
"""
# Generate the response.
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
with torch.inference_mode():
    outputs = model.generate(input_ids=input_ids, max_new_tokens=100, do_sample=True, top_p=0.9, temperature=0.5)
print(f"Prompt:\n{prompt}\n")
print(f"Generated response:\n{tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(prompt):]}")

Citation

@misc{NurtureAI,
    author    = {Raymond Hernandez},
    title     = {NurtureAI/llama-2-7b-int4-gptq-python},
    year      = {2023},
    url       = {https://huggingface.co/NurtureAI/llama-2-7b-int4-gptq-python},
    publisher = {Hugging Face}
}