CodeLlama-7B Solidity: QLoRA Fine-Tuned Adapter (deployable export)

This repository contains a LoRA / PEFT adapter (not a full model) fine-tuned with QLoRA (4-bit base, bitsandbytes) on top of AlfredPros/CodeLlama-7b-Instruct-Solidity.

This is a clean, inference-ready export of the final fine-tune. It contains only the adapter, tokenizer, and config (no optimizer / scheduler / RNG / trainer state), so it is meant for inference / deployment, not for resuming training.

Model Details

| Property | Value |
| --- | --- |
| Base model | AlfredPros/CodeLlama-7b-Instruct-Solidity (CodeLlama-7B, Solidity-tuned) |
| Fine-tuning method | QLoRA (4-bit base) → LoRA adapter |
| Adapter type | LoRA |
| PEFT version | 0.14.0 |
| Task type | CAUSAL_LM |
| Rank (r) | 64 |
| lora_alpha | 16 |
| lora_dropout | 0.1 |
| Target modules | q_proj, v_proj |
| Bias | none |
| Tokenizer | CodeLlamaTokenizerFast |
| Adapter size | ~134 MB (adapter_model.safetensors) |

Note: 4-bit quantization is a training/loading-time setting (bitsandbytes) and is not recorded in adapter_config.json. The adapter can be applied to the base model loaded in 4-bit, 8-bit, fp16, or bf16.
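
The rank, target modules, and adapter size listed above can be cross-checked with a little arithmetic. The sketch below is only an estimate: it assumes the usual CodeLlama-7B shape (hidden size 4096, 32 decoder layers, square q_proj / v_proj matrices), fp32 storage of the adapter weights, and the standard LoRA scaling of lora_alpha / r. None of these assumptions is stated in this card, and the helper name is illustrative.

```python
def lora_param_count(hidden_size, num_layers, rank, num_target_modules):
    """Count LoRA parameters when every target module is a square projection."""
    # Each adapted projection gets two matrices: A (rank x hidden) and B (hidden x rank).
    per_module = 2 * hidden_size * rank
    return per_module * num_target_modules * num_layers


HIDDEN_SIZE = 4096   # assumption: CodeLlama-7B hidden size
NUM_LAYERS = 32      # assumption: CodeLlama-7B decoder layers
RANK = 64            # from the table above
LORA_ALPHA = 16      # from the table above
TARGET_MODULES = ("q_proj", "v_proj")

params = lora_param_count(HIDDEN_SIZE, NUM_LAYERS, RANK, len(TARGET_MODULES))
size_mb_fp32 = params * 4 / 1e6   # assumption: 4 bytes per weight (fp32)
scaling = LORA_ALPHA / RANK       # factor applied to the low-rank update

print(f"adapter parameters: {params:,}")
print(f"estimated size (fp32): {size_mb_fp32:.1f} MB")
print(f"LoRA scaling (lora_alpha / r): {scaling}")
```

Under these assumptions the estimate lands close to the ~134 MB reported for adapter_model.safetensors.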

Files in this repository

| File | Purpose |
| --- | --- |
| adapter_config.json | LoRA/PEFT configuration |
| adapter_model.safetensors | LoRA adapter weights (~134 MB) |
| tokenizer.json, tokenizer_config.json, special_tokens_map.json | Tokenizer |
| training_args.bin | Serialized TrainingArguments from the run |

Note: The ~13 GB base model weights are not included; they are pulled separately from AlfredPros/CodeLlama-7b-Instruct-Solidity. The training dataset and script are not included. Optimizer/scheduler/RNG state are not present, so this export cannot resume training; use a full checkpoint folder for that.
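
To confirm that a downloaded copy matches the settings in Model Details, adapter_config.json can be read with the standard library alone. This is a minimal sketch: the function name and the local path in the usage comment are hypothetical, and the key names follow the usual PEFT LoRA config layout.

```python
import json
from pathlib import Path

# Keys normally present in a PEFT LoRA adapter_config.json.
KEYS = (
    "base_model_name_or_path",
    "peft_type",
    "task_type",
    "r",
    "lora_alpha",
    "lora_dropout",
    "target_modules",
    "bias",
)


def summarize_adapter_config(path):
    """Return the main LoRA settings from an adapter_config.json file."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    # Missing keys map to None instead of raising.
    return {key: config.get(key) for key in KEYS}


# Usage (hypothetical local path to this folder):
# for key, value in summarize_adapter_config("./adapter/adapter_config.json").items():
#     print(f"{key}: {value}")
```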

How to use (inference)

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "AlfredPros/CodeLlama-7b-Instruct-Solidity"
ADAPTER = "Mukesh0606/solidity-codellama-qlora-r64"   # or a local path to this folder

tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
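# Load in fp16; without torch_dtype, many Transformers versions load the weights in fp32 (roughly double the memory).
base_model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="float16", device_map="auto")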
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

prompt = "// Write a secure ERC20 token contract in Solidity\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
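
For a causal LM, generate returns the prompt tokens followed by the newly generated ones, so decoding out[0] as above prints the prompt again before the completion. If only the completion is wanted, the prompt tokens can be sliced off before decoding. This is a small sketch; the helper name is illustrative and not part of Transformers.

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Drop the first `prompt_length` token ids, keeping only the generated ones.

    Works on any sliceable sequence (a Python list or a 1-D tensor).
    """
    return output_ids[prompt_length:]


# Usage with the objects from the snippet above:
# prompt_length = inputs["input_ids"].shape[1]
# completion_ids = strip_prompt_tokens(out[0], prompt_length)
# print(tokenizer.decode(completion_ids, skip_special_tokens=True))
```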

Load with a 4-bit base (matches QLoRA training, low VRAM)

from transformers import BitsAndBytesConfig
import torch

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(base_model, ADAPTER)

Merge the adapter into a standalone model (optional)

# Merge requires the base in fp16/bf16 (not 4-bit).
base_fp16 = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="float16", device_map="auto")
merged = PeftModel.from_pretrained(base_fp16, ADAPTER).merge_and_unload()
merged.save_pretrained("codellama-7b-solidity-merged")
tokenizer.save_pretrained("codellama-7b-solidity-merged")
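
Conceptually, merging folds the low-rank update into each targeted weight matrix: for standard LoRA, W_merged = W + (lora_alpha / r) * B @ A. The pure-Python sketch below shows that arithmetic on toy matrices. It is not PEFT's implementation, and the helper names and toy values are illustrative.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, r, lora_alpha):
    """Return weight + (lora_alpha / r) * (lora_b @ lora_a).

    weight: out x in, lora_a: r x in, lora_b: out x r.
    """
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update with the same shape as weight
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example with rank 1 and lora_alpha 2 (scale = 2.0).
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]     # 1 x 2
lora_b = [[1.0], [0.0]]   # 2 x 1
merged_weight = merge_lora(weight, lora_a, lora_b, r=1, lora_alpha=2)
print(merged_weight)
```

With this adapter's settings (r = 64, lora_alpha = 16) the scale would be 0.25, assuming the standard scaling rule.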

Hardware notes

  • Inference (4-bit): ~6 GB GPU.
  • Inference (fp16): ~16 GB GPU.
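
As a rough sanity check on these figures, the memory taken by the weights alone can be estimated from the parameter count and the bits per parameter. The sketch assumes roughly 6.74 billion parameters for a 7B Llama-family model, which is an approximation not stated in this card. It ignores activations, the KV cache, quantization constants, and CUDA overhead, which plausibly account for the gap to the numbers above.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Estimate the memory needed for model weights alone, in GB (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 6.74e9  # assumption: approximate parameter count of the 7B base model

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)

print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{nf4_gb:.1f} GB")
```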

Framework versions

  • PEFT 0.14.0
  • Transformers (Hugging Face Trainer)
  • Base: CodeLlama-7B-Instruct (Solidity-tuned)

License

Inherits the base model's license (Llama 2 Community License via CodeLlama). Review the base model card before commercial use.
