CodeLlama-7B Solidity — QLoRA Fine-Tuned Adapter (deployable export)

This repository contains a LoRA / PEFT adapter (not a full model) fine-tuned with QLoRA (4-bit base, bitsandbytes) on top of AlfredPros/CodeLlama-7b-Instruct-Solidity.

This is a clean, inference-ready export of the final fine-tune. It contains only the adapter weights, tokenizer, adapter config, and the serialized training arguments (no optimizer, scheduler, RNG, or trainer state), so it is meant for inference and deployment, not for resuming training.

Model Details

Property	Value
Base model	`AlfredPros/CodeLlama-7b-Instruct-Solidity` (CodeLlama-7B, Solidity-tuned)
Fine-tuning method	QLoRA (4-bit base) → LoRA adapter
Adapter type	LoRA
PEFT version	0.14.0
Task type	`CAUSAL_LM`
Rank (`r`)	64
`lora_alpha`	16
`lora_dropout`	0.1
Target modules	`q_proj`, `v_proj`
Bias	none
Tokenizer	`CodeLlamaTokenizerFast`
Adapter size	~134 MB (`adapter_model.safetensors`)

Note: 4-bit quantization is a training/loading-time setting (bitsandbytes) and is not recorded in adapter_config.json. The adapter can be applied to the base model loaded in 4-bit, 8-bit, fp16, or bf16.
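
As a sanity check on the table above, the adapter size can be estimated from the LoRA settings. This is a back-of-the-envelope sketch, not output from the training run. It assumes the standard Llama-2-7B geometry (32 decoder layers, hidden size 4096, so `q_proj` and `v_proj` are both 4096×4096), fp32 storage for the adapter weights, and standard `lora_alpha / r` scaling; none of these are stated in this card.

```python
# Rough LoRA adapter size estimate.
# Assumptions (not from adapter_config.json): 32 layers, hidden size 4096, fp32 weights.
NUM_LAYERS = 32
HIDDEN = 4096
BYTES_PER_PARAM = 4  # fp32

# Values from the Model Details table.
R = 64
LORA_ALPHA = 16
TARGET_MODULES = ["q_proj", "v_proj"]

# Each targeted linear layer gets A (r x in_features) and B (out_features x r).
params_per_module = R * HIDDEN + HIDDEN * R
total_params = NUM_LAYERS * len(TARGET_MODULES) * params_per_module
size_mb = total_params * BYTES_PER_PARAM / 1e6
scaling = LORA_ALPHA / R  # factor applied to the LoRA update (assumed standard scaling)

print(f"trainable LoRA params: {total_params:,}")
print(f"estimated size: {size_mb:.0f} MB")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumptions the estimate comes to about 33.6 million parameters and roughly 134 MB, which is consistent with the reported size of `adapter_model.safetensors`.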

Files in this repository

File	Purpose
`adapter_config.json`	LoRA/PEFT configuration
`adapter_model.safetensors`	LoRA adapter weights (~134 MB)
`tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json`	Tokenizer
`training_args.bin`	Serialized `TrainingArguments` from the run

Note: The ~13 GB base model weights are not included; they are downloaded separately from AlfredPros/CodeLlama-7b-Instruct-Solidity. The training dataset and training script are not included either. Optimizer, scheduler, and RNG state are not present, so this export cannot resume training; use a full checkpoint folder for that.
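
When working from a local copy of this folder, a quick check that every file from the table above is present can save a confusing load error later. The helper below is a minimal sketch using only the standard library; `missing_files` and `EXPECTED_FILES` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# File names taken from the "Files in this repository" table.
EXPECTED_FILES = [
    "adapter_config.json",
    "adapter_model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "training_args.bin",
]

def missing_files(folder):
    """Return the expected file names that are not present in `folder`."""
    root = Path(folder)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_files("path/to/this/folder")` returns an empty list when the copy is complete. Only `adapter_config.json` and `adapter_model.safetensors` are needed to apply the adapter itself; the tokenizer files are needed if the tokenizer is loaded from the same path.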

How to use (inference)

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "AlfredPros/CodeLlama-7b-Instruct-Solidity"
ADAPTER = "Mukesh0606/solidity-codellama-qlora-r64"   # or a local path to this folder

tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
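# Load the base in fp16; without torch_dtype, Transformers may load the weights in fp32 (about twice the memory).
base_model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="float16", device_map="auto")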
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

prompt = "// Write a secure ERC20 token contract in Solidity\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
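
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string in the example above starts with the prompt. If only the completion is wanted, the prompt tokens can be sliced off before decoding. A minimal sketch follows; `completion_ids` is a hypothetical helper name, not part of Transformers.

```python
def completion_ids(output_ids, prompt_len):
    """Drop the first `prompt_len` tokens (the prompt) from one generated sequence.

    Works on anything sliceable: a list of ids or a 1-D tensor such as out[0].
    """
    if prompt_len < 0 or prompt_len > len(output_ids):
        raise ValueError("prompt_len must be between 0 and len(output_ids)")
    return output_ids[prompt_len:]

# Toy example with plain integers standing in for token ids.
prompt_ids = [1, 529, 3015]
generated = prompt_ids + [8571, 29871, 2]
print(completion_ids(generated, len(prompt_ids)))
```

With the variables from the inference snippet, this would be used as `tokenizer.decode(completion_ids(out[0], inputs["input_ids"].shape[1]), skip_special_tokens=True)`.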

Load with a 4-bit base (matches QLoRA training, low VRAM)

# Requires the bitsandbytes package.
# Reuses BASE, ADAPTER, AutoModelForCausalLM, and PeftModel from the snippet above.
import torch
from transformers import BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

Merge the adapter into a standalone model (optional)

# Load the base in fp16/bf16 (not 4-bit) before merging.
# Reuses BASE, ADAPTER, tokenizer, and the imports from the inference snippet above.
base_fp16 = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="float16", device_map="auto")
merged = PeftModel.from_pretrained(base_fp16, ADAPTER).merge_and_unload()
merged.save_pretrained("codellama-7b-solidity-merged")
tokenizer.save_pretrained("codellama-7b-solidity-merged")
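
Conceptually, merging folds the low-rank update into each targeted weight matrix, W_merged = W + (lora_alpha / r) · B·A, after which the adapter is no longer needed at inference. The toy example below checks that identity on tiny matrices in plain Python. It is an illustration assuming standard LoRA scaling, not PEFT's implementation; the matrix values and helper names are made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_scaled(a, b, s):
    """Return a + s * b, elementwise."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Toy sizes: 2x2 base weight, rank r = 1. Values are arbitrary.
r, lora_alpha = 1, 2
scaling = lora_alpha / r
W = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weight (out x in)
A = [[0.5, -1.0]]              # LoRA A (r x in)
B = [[2.0], [1.0]]             # LoRA B (out x r)
x = [[1.0], [2.0]]             # input column vector

# Adapter kept separate: y = W x + scaling * B (A x)
y_adapter = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scaling)

# Adapter merged: W_merged = W + scaling * (B A), then y = W_merged x
W_merged = add_scaled(W, matmul(B, A), scaling)
y_merged = matmul(W_merged, x)

print(y_adapter, y_merged)
```

Both paths give the same output, which is why the merged checkpoint can be served without PEFT installed. For this adapter the scaling factor would be 16 / 64 = 0.25 under the same assumption.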

Hardware notes

Inference (4-bit base): ~6 GB of GPU memory.
Inference (fp16 base): ~16 GB of GPU memory.
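
These figures are roughly what weight storage alone predicts, plus headroom. The sketch below assumes about 6.74 billion parameters for the 7B base (an approximate figure, not stated in this card) and ignores the KV cache, activations, and quantization metadata.

```python
# Back-of-the-envelope weight memory for the base model.
# Assumption: ~6.74e9 parameters (approximate size of a Llama-2-style 7B model).
N_PARAMS = 6.74e9

def weight_gb(bits_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return N_PARAMS * bits_per_param / 8 / 1e9

print(f"fp16 : {weight_gb(16):.1f} GB")
print(f"4-bit: {weight_gb(4):.1f} GB")
```

Under that assumption the weights take about 13.5 GB in fp16 and about 3.4 GB in 4-bit, so the ~16 GB and ~6 GB guidance leaves a few gigabytes for the adapter, the KV cache, and activations. Actual usage grows with prompt length and `max_new_tokens`.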

Framework versions

PEFT 0.14.0
Transformers (Hugging Face Trainer)
Base: CodeLlama-7B-Instruct (Solidity-tuned)

License

Inherits the base model's license (Llama 2 Community License via CodeLlama). Review the base model card before commercial use.
