SQLForge — Qwen2.5-Coder-1.5B (text-to-SQL LoRA adapter)

A QLoRA adapter that fine-tunes Qwen/Qwen2.5-Coder-1.5B-Instruct to translate natural-language questions into SQL, trained on the Spider dataset.

Evaluation uses execution accuracy: every generated query is run against the real SQLite database, and its result set is compared to the result set of the gold query rather than to the gold query's text, which would be a fragile string match.
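As a rough illustration of that comparison, here is a minimal sketch using only the standard library. The helper names `run_query` and `execution_match` are hypothetical and are not taken from the sqlforge code. Comparing rows as an unordered multiset is an assumption; a stricter evaluator might respect row order when the gold query contains ORDER BY.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the rows produced by sql, or None if the query crashes."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries run and return the same multiset of rows."""
    predicted = run_query(conn, predicted_sql)
    gold = run_query(conn, gold_sql)
    if predicted is None or gold is None:
        return False
    # Assumption: row order is ignored, duplicates are not.
    return Counter(predicted) == Counter(gold)


# Tiny in-memory database standing in for a Spider database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE singer ("Name" text, "Age" int)')
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ann", 30), ("Bob", 41)])

# Different text, same result set: counted as correct.
print(execution_match(conn, "SELECT count(*) FROM singer",
                      "SELECT COUNT(Name) FROM singer"))
# Misspelled column: the query crashes, so it is counted as wrong.
print(execution_match(conn, "SELECT Nam FROM singer",
                      "SELECT Name FROM singer"))
```

In this sketch, a prediction for which `run_query` returns None corresponds to a "crashing query" in the results table.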

Results (full Spider dev set, 1034 examples)

Model                                  Execution accuracy   Crashing queries
Base Qwen2.5-Coder-1.5B (zero-shot)    57.45%               228
+ this adapter                         65.57%               148
Change                                 +8.1 pts             −35%

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "Qwen/Qwen2.5-Coder-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")
model = PeftModel.from_pretrained(model, "Abdullahkousa2/sqlforge-qwen2.5-coder-1.5b")
tok = AutoTokenizer.from_pretrained(base)

messages = [
    {"role": "system", "content": "You are an expert data analyst. Given a SQLite "
     "database schema and a question, write a single valid SQLite SQL query that "
     "answers it. Respond with only the SQL query and nothing else."},
    {"role": "user", "content": 'Database schema:\nCREATE TABLE singer ("Name" text, "Age" int);\n\nQuestion: How many singers are there?'},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
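inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens so the prompt is not echoed back
# (splitting on the word "assistant" breaks if that word appears in the SQL).
new_tokens = out[0][inputs["input_ids"].shape[1]:]
print(tok.decode(new_tokens, skip_special_tokens=True).strip())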
# -> SELECT count(*) FROM singer

Or with the sqlforge package:

pip install sqlforge
sqlforge -q "How many singers are there?" --db mydata.sqlite --run
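The user message in the Python example above pairs the database's CREATE TABLE statements with the question. For a real SQLite file, that schema text can be read from the `sqlite_master` table. The sketch below shows one way to do it; `build_user_message` is a hypothetical helper, not part of the sqlforge package, and whether the adapter was trained on exactly this schema formatting is an assumption.

```python
import sqlite3


def build_user_message(conn, question):
    """Format schema and question like the user message in the example."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    # Skip SQLite's internal bookkeeping tables (e.g. sqlite_sequence).
    statements = [
        f"{sql.strip()};" for name, sql in rows
        if not name.startswith("sqlite_")
    ]
    schema = "\n".join(statements)
    return f"Database schema:\n{schema}\n\nQuestion: {question}"


# In-memory database for illustration; use sqlite3.connect("mydata.sqlite")
# to read the schema of a database file instead.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE singer ("Name" text, "Age" int)')

msg = build_user_message(conn, "How many singers are there?")
print(msg)
```

The returned string can be used as the `content` of the `user` entry in `messages`.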

Training

Method: QLoRA — 4-bit NF4 base + LoRA (r=16, α=32, dropout=0.05) on all attention + MLP projections
Schedule: 3 epochs, lr 2e-4 cosine, effective batch size 16, bf16, paged AdamW 8-bit
Hardware: a single RTX 3070 (8GB)
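To give a sense of how small the trained adapter is, the sketch below counts the LoRA parameters implied by r=16 on all attention and MLP projections. The layer shapes (hidden size 1536, MLP size 8960, key/value width 256, 28 layers) and the projection names are assumptions about the Qwen2.5-Coder-1.5B architecture and are not stated in this card; check them against the base model's config before relying on the numbers.

```python
# Assumed shapes for the base model (not stated in this card).
HIDDEN = 1536         # hidden size
INTERMEDIATE = 8960   # MLP inner size
KV_DIM = 256          # key/value projection width (2 KV heads x 128)
NUM_LAYERS = 28

# Values from the Training section.
RANK = 16
ALPHA = 32

# (input dim, output dim) of each adapted projection in one layer.
projections = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in, d_out, rank):
    """LoRA adds A (rank x d_in) and B (d_out x rank) per projection."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(i, o, RANK) for i, o in projections.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update B @ A

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total:,}")
print(f"LoRA scaling (alpha / r):  {scaling}")
```

Under these assumed shapes the adapter has roughly 18.5 million trainable parameters, a little over 1% of the 1.5B base model, which is consistent with training fitting on a single 8GB GPU once the base weights are quantized to 4 bits.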

Limitations

This is a 1.5B-parameter model. Its main failure mode is over-joining: it builds an unnecessary JOIN and then references a column on the wrong table. Fine-tuning cut this failure by a third but did not eliminate it. State-of-the-art accuracy (~90%) requires a frontier model inside an agentic pipeline; a locally trained 1.5B model realistically tops out in the 60s to 70s.

Framework versions

PEFT 0.19.1 · TRL 1.5.1 · Transformers 4.57.6 · PyTorch 2.7.0+cu128
