
LLaMaCoder

Model Description

LLaMaCoder is based on the Llama 2 7B language model, fine-tuned using LoRA adapters.
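To give a sense of why LoRA adapters keep fine-tuning cheap, the sketch below counts the trainable parameters a LoRA adapter adds to a single square projection matrix. The hidden size of 4096 is that of Llama 2 7B; the rank of 8 is a hypothetical value chosen for illustration, since this card does not state the rank or target modules actually used.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameter count of the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Parameter count of the frozen dense weight matrix (d_out x d_in)."""
    return d_out * d_in


HIDDEN_SIZE = 4096  # hidden size of Llama 2 7B
RANK = 8            # hypothetical rank, not taken from this model card

full = full_params(HIDDEN_SIZE, HIDDEN_SIZE)
lora = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
print(f"full matrix: {full} params, LoRA adapter: {lora} params ({lora / full:.4%})")
```

Under these assumptions the adapter trains well under one percent of the parameters of the matrix it modifies, while the original weights stay frozen.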

Usage

To generate code with LLaMaCoder loaded in 4-bit precision, use the following Python snippet:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer
import torch

MODEL_NAME = "Sakuna/LLaMaCoderAll"
device = "cuda:0"


bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

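# Load the model in 4-bit and place it on the GPU at load time.
# bitsandbytes-quantized models should not be moved with .to() afterwards.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map={"": device},
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
# LLaMA tokenizers ship without a pad token, so reuse the EOS token.
tokenizer.pad_token = tokenizer.eos_token

model.eval()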

prompt = "Write a Java program to calculate the factorial of a given number k"
text = f"{prompt}\n### Solution:\n"

inputs = tokenizer(text, return_tensors="pt").to(device)
# do_sample=True is needed for temperature to take effect;
# max_length counts the prompt tokens as well as the generated ones.
outputs = model.generate(**inputs, max_length=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
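Because generate() on a causal language model returns the prompt tokens followed by the newly generated ones, the decoded string printed above begins with the prompt itself. A minimal sketch of stripping it, assuming the "### Solution:" marker from the snippet above appears in the output; the helper name extract_solution is hypothetical and not part of the model or of transformers:

```python
SOLUTION_MARKER = "### Solution:"  # marker used in the prompt template above


def extract_solution(decoded: str, marker: str = SOLUTION_MARKER) -> str:
    """Return the text after the first marker, or the whole string if the marker is absent."""
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


# Stand-in for a decoded model output; real generations will differ.
example = (
    "Write a Java program to calculate the factorial of a given number k\n"
    "### Solution:\n"
    "public class Factorial {}\n"
)
print(extract_solution(example))
```

In the usage snippet, this would be applied to the string returned by tokenizer.decode(...) before printing or saving it.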
