This model is a fine-tuned version of Qwen/Qwen1.5-1.8B-Chat adapted for Text-to-SQL generation using the Spider dataset.
Fine-tuning used QLoRA (Quantized Low-Rank Adaptation), a parameter-efficient method that keeps the base model frozen in 4-bit precision and trains only a small set of low-rank adapter weights instead of the full model.
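To make the adapter idea concrete, here is a minimal pure-Python sketch of a LoRA-style forward pass, assuming the standard formulation in which a frozen weight matrix W is augmented by a scaled low-rank product B·A. The names `matvec` and `lora_forward` and the toy dimensions are illustrative and not taken from the training code; real QLoRA additionally stores W in 4-bit precision.

```python
def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x).

    W is frozen; only A (r x d_in) and B (d_out x r) would be trained.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Toy example with d_in = d_out = 3 and rank r = 1 (hypothetical values).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A = [[0.5, 0.5, 0.5]]            # r x d_in
B_zero = [[0.0], [0.0], [0.0]]   # d_out x r, zero-initialised as in LoRA
x = [1.0, 2.0, 3.0]

# With B = 0 the adapter contributes nothing, so the output equals W x.
y_init = lora_forward(W, A, B_zero, x, alpha=2, r=1)

# Once B is non-zero, the low-rank term shifts the output.
B_trained = [[1.0], [0.0], [0.0]]
y_trained = lora_forward(W, A, B_trained, x, alpha=2, r=1)

print(y_init, y_trained)
```

Only A and B receive gradient updates in this scheme, which is why the number of trained weights stays small relative to W.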
Convert natural language questions into SQL queries.
Example:
SELECT count(*) FROM singer

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen1.5-1.8B-Chat |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, v_proj, k_proj, o_proj |
| Learning rate | 2e-4 |
| Batch size | 8 |
| Gradient accumulation | 2 (effective batch: 16) |
| Epochs | 2 |
| LR scheduler | cosine |
| Quantization | 4-bit NF4 (QLoRA) |
| Max sequence length | 512 |
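
The table implies a few derived quantities worth spelling out: the effective batch size, the LoRA scaling factor alpha/r, and the approximate number of trainable adapter weights. The sketch below computes them, assuming Qwen1.5-1.8B uses a hidden size of 2048 and 24 transformer layers with square attention projections. Those architecture figures are not stated in this card and should be checked against the base model's `config.json`.

```python
# Values taken from the hyperparameter table.
lora_rank = 16
lora_alpha = 32
batch_size = 8
grad_accum_steps = 2
target_modules = ["q_proj", "v_proj", "k_proj", "o_proj"]

# ASSUMPTION: architecture figures for Qwen1.5-1.8B, not stated in this card.
hidden_size = 2048
num_layers = 24

effective_batch = batch_size * grad_accum_steps   # 16, as in the table
lora_scale = lora_alpha / lora_rank               # factor applied to B @ A

# Each adapted projection adds A (r x d_in) and B (d_out x r).
params_per_module = lora_rank * (hidden_size + hidden_size)
trainable_params = params_per_module * len(target_modules) * num_layers

print(f"effective batch size: {effective_batch}")
print(f"LoRA scale (alpha / r): {lora_scale}")
print(f"approx. trainable adapter parameters: {trainable_params:,}")
```

Under these assumptions the adapters amount to roughly 6.3 million weights, well under 1% of a 1.8B-parameter model.
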

| Metric | Baseline | Fine-Tuned | Improvement |
|---|---|---|---|
| Exact Match Accuracy | 0.0% | 6.0% | +6.0 pp |
| Avg Token Match | 34.39% | 54.75% | +20.36 pp |

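
The card does not define how the two metrics are computed. As one plausible reading, the sketch below treats Exact Match as string equality after case and whitespace normalisation, and Token Match as the fraction of reference tokens that also appear in the prediction. Both functions (`exact_match`, `token_match`) are hypothetical illustrations, not the evaluation code behind the numbers above.

```python
from collections import Counter

def normalize(sql):
    """Lower-case, drop a trailing semicolon, and collapse whitespace."""
    return " ".join(sql.strip().rstrip(";").lower().split())

def exact_match(pred, gold):
    """True if the normalised strings are identical."""
    return normalize(pred) == normalize(gold)

def token_match(pred, gold):
    """Fraction of gold tokens (with multiplicity) found in the prediction."""
    pred_tokens = Counter(normalize(pred).split())
    gold_tokens = Counter(normalize(gold).split())
    if not gold_tokens:
        return 0.0
    overlap = sum((pred_tokens & gold_tokens).values())
    return overlap / sum(gold_tokens.values())

gold = "SELECT count(*) FROM singer"
pred_ok = "select COUNT(*)  from singer;"
pred_partial = "SELECT name FROM singer"

em_ok = exact_match(pred_ok, gold)
em_partial = exact_match(pred_partial, gold)
tm_partial = token_match(pred_partial, gold)

print(em_ok, em_partial, tm_partial)
```

A token-level score of this kind rewards partially correct queries, which would explain how Token Match can be well above zero while Exact Match stays low.
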
| Metric | Score |
|---|---|
| MMLU Accuracy (50 samples) | 16.0% |
| Random baseline | 25.0% |
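On this MMLU subset the fine-tuned model scores 16.0%, which is below the 25.0% random baseline, so these results do not demonstrate that general knowledge is retained after SQL fine-tuning. No base-model score is reported for comparison.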
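Because the MMLU figure comes from only 50 samples, its sampling uncertainty is large. The following sketch estimates a 95% confidence interval using the normal approximation to the binomial. The choice of this approximation is an illustration and not part of the original evaluation.

```python
import math

n_samples = 50
accuracy = 0.16          # 16.0% reported on the 50-sample MMLU subset
random_baseline = 0.25   # four-way multiple choice

# Standard error of a proportion and a 95% normal-approximation interval.
std_err = math.sqrt(accuracy * (1 - accuracy) / n_samples)
margin = 1.96 * std_err
ci_low, ci_high = accuracy - margin, accuracy + margin

print(f"95% CI: {ci_low:.1%} to {ci_high:.1%}")
print("random baseline inside CI:", ci_low <= random_baseline <= ci_high)
```

Under this approximation the interval spans roughly 6% to 26%, so the measured score is statistically compatible with random guessing.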
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_name = "Qwen/Qwen1.5-1.8B-Chat"
adapter_name = "faltooz123/qwen1.5-sql-qlora-spider"

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # 4-bit NF4 quantization, matching the training setup
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map={"": 0},  # place the whole model on GPU 0
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_name)  # attach the LoRA adapter
model.eval()

def generate_sql(question, db_id):
    # Build the prompt in the ChatML layout used by Qwen1.5-Chat.
    prompt = (
        "<|im_start|>system\n"
        "You are an expert SQL assistant.<|im_end|>\n"
        "<|im_start|>user\n"
        f"Database: {db_id}\n"
        f"Question: {question}\n"
        "Write only the SQL query.<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    # Send the inputs to whichever device the model was loaded on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Drop the prompt tokens so only the newly generated SQL is decoded.
    generated = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()

sql = generate_sql("How many singers do we have?", "concert_singer")
print(sql)
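
Generated SQL is not guaranteed to be valid. One hedged way to sanity-check a query before using it is to run it against an in-memory SQLite database that has the expected schema. The `singer` table definition and the helper `is_executable` below are hypothetical stand-ins; the real `concert_singer` schema in Spider has more columns and tables.

```python
import sqlite3

def is_executable(sql, schema_ddl):
    """Return True if `sql` runs without error against the given schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)   # create the (empty) tables
        conn.execute(sql).fetchall()     # run the candidate query
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()

# Hypothetical, simplified schema for illustration only.
schema = "CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT);"

valid = is_executable("SELECT count(*) FROM singer", schema)
invalid = is_executable("SELECT count(*) FROM singers", schema)

print(valid, invalid)
```

A check like this only confirms that the query parses and references existing tables and columns; it does not confirm that the query answers the question.
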
@misc{qwen1.5-sql-qlora,
title = {Qwen1.5-1.8B SQL Fine-Tuned with QLoRA on Spider},
year = {2025},
}