Qwen3.5-0.8B Text2SQL
Supervised Fine-Tuning (SFT) for Natural Language to SQL Generation
Qwen3.5-0.8B fine-tuned on the Spider, BIRD23, and SynSQL-2.5M datasets using QLoRA with Unsloth.
Project repository: https://github.com/MuhammadNafishZaldinanda/finetuning-text2sql
Dataset
Dialect: SQLite
SynSQL-2.5M Filtering Configuration
| Criteria | Value |
|---|---|
| Question Style | Formal, Colloquial, Imperative, Interrogative, Descriptive, Concise |
| Simple | 700 |
| Moderate | 2,800 |
| Complex | 2,800 |
| Highly Complex | 700 |
| Total Samples | 7,000 |
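The per-complexity quotas in the table can be reproduced with a simple stratified sample. The sketch below is one possible way to do it, not necessarily the procedure used in this project. The field name `sql_complexity` and the helper `stratified_sample` are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Per-complexity quotas taken from the filtering table (total: 7,000).
QUOTAS = {"Simple": 700, "Moderate": 2800, "Complex": 2800, "Highly Complex": 700}


def stratified_sample(rows, quotas, key="sql_complexity", seed=42):
    """Draw quotas[level] rows at random for each complexity level.

    `key` is an assumed field name; adjust it to the actual dataset schema.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row)

    rng = random.Random(seed)
    sample = []
    for level, n in quotas.items():
        pool = buckets.get(level, [])
        if len(pool) < n:
            raise ValueError(f"only {len(pool)} rows for {level!r}, need {n}")
        sample.extend(rng.sample(pool, n))

    rng.shuffle(sample)  # avoid ordering the training data by difficulty
    return sample


# Tiny synthetic demo: 3,000 fake rows per level.
demo_rows = [
    {"sql_complexity": level, "question": f"q{i}"}
    for level in QUOTAS
    for i in range(3000)
]
subset = stratified_sample(demo_rows, QUOTAS)
print(len(subset))
```

Fixing the seed keeps the subset reproducible across runs.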
Instruction Prompt
Task Overview:
You are a data science expert. Below, you are provided with a database schema and a natural language question. Your task is to understand the schema and generate a valid SQL query to answer the question.
Database Engine:
SQLite
Database Schema:
{db_details}
This schema describes the database's structure, including tables, columns, primary keys, foreign keys, and any relevant relationships or constraints.
Question:
{evidence}{question}
Instructions:
- Make sure you only output the information that is asked in the question. If the question asks for a specific column, make sure to only include that column in the SELECT clause, nothing more.
- The generated query should return all of the information asked in the question without any missing or extra information.
- Before generating the final SQL query, please think through the steps of how to write the query.
Output Format:
In your answer, please enclose the generated SQL query in a code block:
```sql
-- Your SQL query
```
Take a deep breath and think step by step to find the correct SQL query.
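To use the prompt above programmatically, the placeholders `{db_details}`, `{evidence}`, and `{question}` need to be filled in, and the SQL has to be pulled out of the fenced block in the model's answer. The following is a minimal sketch under a few assumptions: the template is abbreviated here, the evidence is joined to the question with a newline, and the last `sql` block in the response is treated as the final answer. `build_prompt` and `extract_sql` are hypothetical helper names.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Abbreviated stand-in for the full instruction prompt; same placeholders.
TEMPLATE = (
    "Database Engine:\nSQLite\n\n"
    "Database Schema:\n{db_details}\n\n"
    "Question:\n{evidence}{question}\n"
)

SQL_BLOCK = re.compile(FENCE + r"sql\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def build_prompt(db_details, question, evidence=""):
    """Fill the template; evidence (if any) goes on its own line before the question."""
    evidence_part = f"{evidence.strip()}\n" if evidence else ""
    return TEMPLATE.format(
        db_details=db_details, evidence=evidence_part, question=question
    )


def extract_sql(response):
    """Return the content of the last sql code block, or None if there is none."""
    matches = SQL_BLOCK.findall(response)
    return matches[-1].strip() if matches else None


prompt = build_prompt(
    "CREATE TABLE t (x INTEGER);", "How many rows are in t?", evidence="x is an id"
)
response = (
    "First I count the rows.\n"
    + FENCE + "sql\nSELECT 1;\n" + FENCE + "\n"
    + "Final answer:\n"
    + FENCE + "sql\nSELECT COUNT(*) FROM t;\n" + FENCE
)
print(extract_sql(response))
```

Taking the last block is a design choice: the prompt asks the model to reason first, so earlier blocks may be intermediate drafts.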
LoRA Configuration
| Parameter | Value |
|---|---|
| Quantization | 4-bit |
| LoRA Rank (r) | 32 |
| LoRA Alpha | 64 |
| LoRA Dropout | 0.0 |
| Bias | none |
| Trainable Parameters | 12.78M |
| Percentage of Trainable Parameters | 2.22% |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
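A short calculation helps relate these numbers to each other. With standard LoRA (assuming rank-stabilized scaling is not enabled), the adapter update is scaled by alpha / r, and each adapted linear layer gains r × (d_in + d_out) trainable parameters. The layer shape below is purely hypothetical, since the model's actual dimensions are not listed here.

```python
R = 32
ALPHA = 64


def lora_param_count(d_in, d_out, r):
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Standard LoRA scaling factor applied to the low-rank update.
scaling = ALPHA / R

# Hypothetical 1024 x 1024 projection, for illustration only.
example_params = lora_param_count(1024, 1024, R)

# Parameter total implied by the two reported figures (12.78M is 2.22% of it).
implied_total = 12.78e6 / 0.0222

print(scaling, example_params, round(implied_total / 1e6, 1))
```

The implied total comes out well below 0.8B. One plausible explanation is that 4-bit packed weights are counted as fewer stored elements, but that is a guess rather than something the table confirms.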
Training Configuration
| Parameter | Value |
|---|---|
| Base Model | Qwen3.5-0.8B |
| Total Dataset | 20,626 samples |
| Epochs | 1 |
| Max Sequence Length | 8704 |
| Learning Rate | 1e-5 |
| Scheduler | Cosine |
| Warmup Ratio | 10% |
| Optimizer | adamw_torch_fused |
| Max Gradient Norm | 0.5 |
| Batch Size | 1 |
| Gradient Accumulation Steps | 8 |
| Hardware | NVIDIA RTX 4000 SFF Ada |
| Available VRAM | 20 GB |
| Peak VRAM Usage | ~19 GB |
| Training Time | 7 hours 36 minutes |
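The batch size, gradient accumulation, and warmup settings determine the number of optimizer steps and the shape of the learning-rate curve. The sketch below works this out under two assumptions: all 20,626 samples are used for training on a single GPU (some may in fact be held out for validation), and the schedule is linear warmup followed by cosine decay to zero.

```python
import math

NUM_SAMPLES = 20626      # assumed to be all training samples
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 8
PEAK_LR = 1e-5
WARMUP_RATIO = 0.10

# One optimizer step consumes batch_size * accumulation_steps samples.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
total_steps = math.ceil(NUM_SAMPLES / effective_batch)
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, lr_at(step))
```

Under these assumptions the run takes roughly 2,579 optimizer steps, which works out to around 10 to 11 seconds per step for the reported training time.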
Training Results
| Metric | Value |
|---|---|
| Final Train Loss | 0.262 |
| Final Validation Loss | 0.218 |
Model Performance Evaluation: Base vs. Fine-Tuned (Qwen3.5-0.8B)
1. Base Model (Qwen3.5-0.8B)
Overall Performance
| Metric | Value |
|---|---|
| Accuracy | 21.3% |
| Correct | 106 |
| Wrong | 152 |
| Execution Error | 240 |
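The three outcome categories (Correct, Wrong, Execution Error) suggest an execution-based evaluation: the predicted query is run against the database and its result is compared with the result of the reference query. The sketch below illustrates that idea with Python's built-in `sqlite3` module. It is an assumption about how such a check can be done, not the project's actual evaluation script; in particular, comparing results as unordered sets of rows is a choice made here, and `classify` is a hypothetical helper.

```python
import sqlite3


def classify(conn, predicted_sql, gold_sql):
    """Return 'correct', 'wrong', or 'execution_error' for one prediction."""
    gold_rows = conn.execute(gold_sql).fetchall()
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # Syntax errors, unknown tables or columns, and similar failures.
        return "execution_error"
    # Compare result sets while ignoring row order.
    return "correct" if set(pred_rows) == set(gold_rows) else "wrong"


# Small in-memory database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)", [(1, "Ana", 30), (2, "Budi", 25)]
)

gold = "SELECT name FROM singer WHERE age > 26"
predictions = [
    "SELECT name FROM singer WHERE age >= 30",      # same rows as gold
    "SELECT name, age FROM singer WHERE age > 26",  # extra column
    "SELECT nam FROM singer",                       # unknown column
]
outcomes = [classify(conn, sql, gold) for sql in predictions]
print(outcomes)
```

The second prediction shows why the prompt insists on selecting only the requested columns: an extra column changes the result set and counts as wrong even though the filter is right.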
Performance by Difficulty
| Difficulty | Correct / Total | Accuracy |
|---|---|---|
| Simple | 51 / 148 | 34.5% |
| Moderate | 47 / 250 | 18.8% |
| Challenging | 8 / 102 | 7.8% |
2. Fine-Tuned Model (QLoRA)
Overall Performance
| Metric | Value |
|---|---|
| Accuracy | 18.3% |
| Correct | 91 |
| Wrong | 171 |
| Execution Error | 236 |
Performance by Difficulty
| Difficulty | Correct / Total | Accuracy |
|---|---|---|
| Simple | 57 / 148 | 38.5% |
| Moderate | 26 / 250 | 10.4% |
| Challenging | 8 / 102 | 7.8% |
3. Head-to-Head Comparison
| Metric | Base Model | Fine-Tuned (QLoRA) | Difference |
|---|---|---|---|
| Overall Accuracy | 21.3% | 18.3% | -3.0 pp |
| Simple | 34.5% | 38.5% | +4.0 pp |
| Moderate | 18.8% | 10.4% | -8.4 pp |
| Challenging | 7.8% | 7.8% | 0.0 pp |
| Execution Error | 240 | 236 | -4 |
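The reported percentages can be recomputed from the raw counts as a sanity check. The snippet below does this using only the numbers from the tables above. One observation it surfaces: the outcome counts for each model sum to 498, while the per-difficulty totals sum to 500, and the reported overall accuracies match a denominator of 498.

```python
BASE = {"correct": 106, "wrong": 152, "execution_error": 240}
TUNED = {"correct": 91, "wrong": 171, "execution_error": 236}

# difficulty: (base correct, fine-tuned correct, total questions)
BY_DIFFICULTY = {
    "Simple": (51, 57, 148),
    "Moderate": (47, 26, 250),
    "Challenging": (8, 8, 102),
}


def pct(correct, total):
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100 * correct / total, 1)


base_total = sum(BASE.values())
tuned_total = sum(TUNED.values())
difficulty_total = sum(total for _, _, total in BY_DIFFICULTY.values())

base_acc = pct(BASE["correct"], base_total)
tuned_acc = pct(TUNED["correct"], tuned_total)
print("overall:", base_acc, tuned_acc, round(tuned_acc - base_acc, 1))

for name, (base_ok, tuned_ok, total) in BY_DIFFICULTY.items():
    b, t = pct(base_ok, total), pct(tuned_ok, total)
    print(name, b, t, round(t - b, 1))

print("evaluated:", base_total, "listed by difficulty:", difficulty_total)
```

The differences are in percentage points, not relative change: the Moderate drop from 18.8% to 10.4% is 8.4 points, which is a much larger decline in relative terms.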