Problem Statement
The objective of fine-tuning CodeLlama on the SQL dataset was to improve its ability to generate accurate SQL queries from natural-language prompts. Out of the box, the model's performance on this task was suboptimal, motivating fine-tuning to raise query-generation accuracy.
Dataset and Model
- Dataset: ChrisHayduk/Llama-2-SQL-Dataset
- Model: CodeLlama (codellama/CodeLlama-7b-hf), a pre-trained language model tailored for code generation tasks, was chosen as the base model for fine-tuning. Its architecture and extensive pre-training on code-related tasks provided a strong foundation for this specialized fine-tuning.
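The dataset above can be pulled directly from the Hugging Face Hub with the datasets library; split and column names should be checked by inspecting the loaded object, as they are not documented in this report:

from datasets import load_dataset

# Load the SQL fine-tuning dataset from the Hugging Face Hub.
# Inspect the printed structure to confirm the available splits and columns.
dataset = load_dataset("ChrisHayduk/Llama-2-SQL-Dataset")
print(dataset)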
Training Procedure
The fine-tuning process involved several key steps and configurations to optimize model performance:
Library and Tools: PEFT (parameter-efficient fine-tuning) was used for the fine-tuning process. Rather than updating all model weights, PEFT trains a small set of adapter parameters on top of the frozen base model, which keeps memory requirements manageable; a sketch of a typical setup follows.
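The exact adapter configuration was not recorded here; the sketch below assumes a LoRA adapter (consistent with the PeftModel loading in the Code section) with illustrative hyperparameters:

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model (fp16 here is an illustrative choice to fit a single GPU).
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Hypothetical LoRA settings; the values actually used are not recorded in this report.
lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable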
Quantization Configuration: The bitsandbytes quantization method was employed with the following configuration:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: float32
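In transformers, this configuration corresponds to a BitsAndBytesConfig passed at model load time; a minimal sketch of the same settings:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and float32 compute,
# mirroring the configuration listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float32,
)

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)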
Result
Fine-tuning targeted the shortcomings observed in the base model's SQL generation. Post-fine-tuning evaluation measured the model's query accuracy, generation speed, and overall efficiency across queries of varying complexity; a sketch of one possible accuracy measurement follows.
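The evaluation harness is not included in this report; the snippet below is a minimal sketch of one plausible accuracy metric (normalized exact match against reference queries). The generate_sql helper and the question/context/answer field names are hypothetical.

def exact_match_accuracy(examples, generate_sql):
    # Fraction of examples whose generated SQL equals the reference query
    # after lowercasing and whitespace normalization.
    def normalize(sql):
        return " ".join(sql.strip().lower().split())

    correct = sum(
        normalize(generate_sql(ex["question"], ex["context"])) == normalize(ex["answer"])
        for ex in examples
    )
    return correct / len(examples)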
Code
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer
from peft import PeftModel

# Load the base model; device_map="auto" places the weights on available GPUs.
# To reproduce the 4-bit setup used in training, pass the BitsAndBytesConfig
# shown above via quantization_config instead of the commented-out flags.
base_model = "codellama/CodeLlama-7b-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    # load_in_4bit=True,
    # torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach the fine-tuned PEFT adapter saved during training.
output_dir = "model/output/path"
model = PeftModel.from_pretrained(model, output_dir)
eval_prompt = """You are a powerful text-to-SQL model. Your job is to answer questions about a database and explain your answer in detail. You are given a question and context regarding one or more tables.
You must output the SQL query that answers the question.
### Input:
Which Class has a Frequency MHz larger than 91.5, and a City of license of hyannis, nebraska?
### Context:
CREATE TABLE table_name_12 (class VARCHAR, frequency_mhz VARCHAR, city_of_license VARCHAR)
### Response:
### Explain:
"""
# Tokenize the prompt and generate; assumes a CUDA device is available.
model_input = tokenizer(eval_prompt, return_tensors="pt").to("cuda")
model.eval()
with torch.no_grad():
    output_ids = model.generate(**model_input, max_new_tokens=100)[0]
    print(tokenizer.decode(output_ids, skip_special_tokens=True))
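For this prompt, a correct completion under the ### Response: header would resemble SELECT class FROM table_name_12 WHERE frequency_mhz > 91.5 AND city_of_license = "hyannis, nebraska"; actual model output will vary with decoding settings and training quality.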