
zsql-sqlite is a text-to-SQL model, instruction-tuned to translate English-language questions into SQLite queries. It was trained on the ZeroLink DPO dataset.

The model only generates SQL queries and is intended as a starting point for further fine-tuning on specific database schemas.

Usage

You can run this model using the following code:

import transformers
from transformers import AutoTokenizer

model = "zerolink/zsql-en-sqlite"

tokenizer = AutoTokenizer.from_pretrained(model)

prompt = """
Using the schema:
CREATE TABLE Product (
    product_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price DECIMAL NOT NULL,
    description TEXT
);

CREATE TABLE Customer (
    customer_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT,
    phone TEXT
);
Generate SQL for the following question:
What are the prices and descriptions for all products that are greater than $5?
"""

system = "Translate English to SQLite SQL."
message = [
    {"role": "system", "content": system},
    {"role": "user", "content": prompt},
]

prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

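# Generate text. max_length counts the prompt tokens as well as the new ones,
# and generated_text contains the prompt followed by the completion by default.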
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    num_return_sequences=1,
    max_length=1024,
)
print(sequences[0]['generated_text'])
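
Because the model's output is meant to be run against a real database, it can help to check a generated query against the schema before using it. The following is a minimal sketch of one way to do that with Python's built-in sqlite3 module; the check_sql helper and the candidate query are hypothetical and written by hand for illustration, not actual output from the model.

```python
import sqlite3

# Same schema as in the prompt above.
SCHEMA = """
CREATE TABLE Product (
    product_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price DECIMAL NOT NULL,
    description TEXT
);
CREATE TABLE Customer (
    customer_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT,
    phone TEXT
);
"""


def check_sql(schema: str, query: str) -> tuple[bool, str]:
    """Return (ok, message) after asking SQLite to plan the query on an empty database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN QUERY PLAN compiles the statement without executing it.
        conn.execute("EXPLAIN QUERY PLAN " + query)
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


# Hypothetical model output, written by hand for illustration.
candidate = "SELECT price, description FROM Product WHERE price > 5"
print(check_sql(SCHEMA, candidate))
# A query that references a column missing from the schema is rejected.
print(check_sql(SCHEMA, "SELECT cost FROM Product"))
```

A check like this only confirms that the query parses and that its tables and columns exist; it does not confirm that the query answers the question.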

Training hyperparameters

LoRA:

  • r=16
  • lora_alpha=16
  • lora_dropout=0.05
  • bias="none"
  • task_type="CAUSAL_LM"
  • target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
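
To give a feel for what r=16 and lora_alpha=16 imply, here is a small arithmetic sketch. It assumes the usual LoRA formulation, in which a rank-r update B @ A is scaled by lora_alpha / r. The layer width of 4096 is a hypothetical value chosen only for illustration; the card does not state the dimensions of the base model.

```python
def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


r, lora_alpha = 16, 16
scaling = lora_alpha / r  # factor applied to the low-rank update B @ A

# Hypothetical square projection of width 4096 (an assumption, not from the card).
d = 4096
full = d * d
extra = lora_extra_params(d, d, r)

print("scaling:", scaling)
print("adapter params:", extra)
print("percent of full matrix:", round(100 * extra / full, 2))
```

With lora_alpha equal to r, the scaling factor is 1, so the low-rank update is added to the frozen weights without extra amplification.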

Training arguments:

  • per_device_train_batch_size=4
  • gradient_accumulation_steps=4
  • gradient_checkpointing=True
  • learning_rate=5e-5
  • lr_scheduler_type="linear"
  • max_steps=200
  • optim="paged_adamw_32bit"
  • warmup_steps=100
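
The training arguments above interact in ways that are easy to miss, so the sketch below works through the arithmetic. It assumes the common linear schedule shape (linear warmup from zero, then linear decay to zero) and reports figures per device, since the number of devices is not stated. The linear_lr function is a hypothetical helper, not part of any library.

```python
def linear_lr(step: int, base_lr: float = 5e-5, warmup: int = 100, total: int = 200) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


per_device_batch = 4
grad_accum = 4
max_steps = 200

# Examples consumed per optimizer step on one device.
effective_batch = per_device_batch * grad_accum
print("effective batch per device:", effective_batch)
print("examples seen per device:", effective_batch * max_steps)

for step in (0, 50, 100, 150, 200):
    print(step, linear_lr(step))
```

Under these assumptions the warmup takes up half of the 200 steps, so the learning rate reaches its peak of 5e-5 only at step 100 and then decays for the remainder of the run.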

DPOTrainer:

  • beta=0.1
  • max_prompt_length=4096
  • max_length=3516
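
For readers unfamiliar with the beta parameter, the following sketch computes the standard sigmoid form of the DPO loss for a single preference pair. Whether this run used exactly that loss variant is not stated on the card, and the log-probabilities below are made-up numbers for illustration only.

```python
import math


def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Sigmoid DPO loss for one pair, given sequence log-probabilities."""
    # How much more the policy prefers the chosen answer than the reference does.
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # -log(sigmoid(x)) written as log(1 + exp(-x)), with x = beta * margin.
    return math.log1p(math.exp(-beta * margin))


# Policy identical to the reference: the loss equals log(2).
print(dpo_loss(-10.0, -20.0, -10.0, -20.0))

# Made-up values where the policy favors the chosen answer more than the reference.
print(dpo_loss(-10.0, -20.0, -12.0, -15.0))
```

A smaller beta flattens the loss with respect to the margin, which lets the policy move further from the reference model before the loss saturates.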

Model size: 7.24B params. Tensor type: BF16. Weights format: Safetensors.