
Model Details

Model Description

This is the model card for SQLBot, a fine-tuned version of the Meta-Llama-3-8B model. SQLBot generates SQL queries from natural-language prompts and was fine-tuned on a dataset of synthetic text-to-SQL pairs provided by Gretel.ai.

Developed by: Rohit Patil
Shared by: Rohit Patil
Model type: Causal Language Model (Causal LM)
Language(s): English
License: Apache-2.0
Finetuned from model: meta-llama/Meta-Llama-3-8B

Model Sources

Repository: RohitPatill/SQLBot

Uses

Direct Use

SQLBot can be used directly to generate SQL queries from natural-language descriptions of the desired query.
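The training data pairs each question with schema context, so prompts that include the relevant CREATE TABLE statements are likely to be better grounded. The exact prompt template used during fine-tuning is not documented in this card, so the shape below is an illustrative assumption:

# Hypothetical prompt shape; the template used in fine-tuning is not documented here.
schema = "CREATE TABLE users (id INT, name TEXT, signup_date DATE);"
question = "Get all users who signed up in the last month."
prompt = f"Schema:\n{schema}\n\nQuestion: {question}\n\nSQL:"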

Downstream Use

  • Data Analysis: Assisting data analysts in generating SQL queries.
  • Educational Tools: Used in tools to teach SQL query formulation.

Out-of-Scope Use

  • Sensitive Data Queries: Should not be used to generate queries involving sensitive or personally identifiable information without proper safeguards.
  • Production Systems: The model might require further validation before being used in critical production systems due to potential inaccuracies.

Bias, Risks, and Limitations

The model is trained on synthetic data, which may not fully represent real-world SQL queries. Users should be aware of potential biases in the training data and should validate the generated queries before use.

Recommendations

  • Validation: Always validate generated SQL queries before executing them (see the sketch after this list).
  • Awareness: Be aware of the potential biases and limitations of the model, especially when used in sensitive contexts.
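
A lightweight way to validate, as a minimal sketch: compile the generated query against an empty in-memory SQLite database so syntax and schema errors surface before it touches real data. The schema and query here are illustrative, and SQLite's dialect differs from other engines, so this is only a first-pass check:

import sqlite3

def validate_query(schema_sql: str, query: str) -> bool:
    """Return True if `query` compiles against `schema_sql` in SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)               # create empty tables
        conn.execute(f"EXPLAIN QUERY PLAN {query}")  # plan the query without running it
        return True
    except sqlite3.Error as exc:
        print(f"Rejected query: {exc}")
        return False
    finally:
        conn.close()

schema = "CREATE TABLE users (id INT, name TEXT, signup_date DATE);"
print(validate_query(schema, "SELECT name FROM users WHERE signup_date >= date('now', '-1 month');"))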

How to Get Started with the Model

Use the following code to load and use the model:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RohitPatill/SQLBot")
model = AutoModelForCausalLM.from_pretrained("RohitPatill/SQLBot")

prompt = "Get all users who signed up in the last month."
inputs = tokenizer(prompt, return_tensors="pt")

# Allow enough new tokens for a full query; the default generation limit can truncate it.
outputs = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
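
Note that an 8B-parameter model needs roughly 16 GB of GPU memory in fp16, which is tight on a free-tier T4. If loading fails with out-of-memory errors, a common workaround (an assumption here, not something the card prescribes) is 4-bit quantized loading via bitsandbytes:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantized loading; requires the bitsandbytes package and a CUDA GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "RohitPatill/SQLBot",
    quantization_config=bnb_config,
    device_map="auto",
)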

Training Details

Training Data

The model was fine-tuned on the "gretelai/synthetic_text_to_sql" dataset, which includes synthetic text-to-SQL pairs.
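
The dataset is hosted on the Hugging Face Hub and can be loaded directly for inspection; each record pairs a natural-language question with schema context and a target SQL query:

from datasets import load_dataset

# Download the synthetic text-to-SQL dataset from the Hugging Face Hub.
dataset = load_dataset("gretelai/synthetic_text_to_sql")
print(dataset)              # splits and row counts
print(dataset["train"][0])  # one example record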

Training Procedure

Preprocessing

  • Tokenizer: AutoTokenizer from the transformers library
  • Padding: Right-side padding with EOS token
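
In code, this preprocessing corresponds to the tokenizer setup below (Llama 3 ships without a dedicated pad token, so the EOS token is reused):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("RohitPatill/SQLBot")
tokenizer.padding_side = "right"            # pad on the right, per the card
tokenizer.pad_token = tokenizer.eos_token   # reuse EOS as the padding token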

Training Hyperparameters

  • Training regime: Single-epoch fine-tuning (capped at 501 optimizer steps)
  • Batch size: 2 (per device)
  • Gradient accumulation steps: 4
  • Warmup steps: 5
  • Max steps: 501
  • Learning rate: 2e-4
  • Optimizer: AdamW (8-bit)
  • Weight decay: 0.01
  • Scheduler: Linear
  • Seed: 3407
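
As a sketch, these hyperparameters map onto transformers.TrainingArguments as shown below; the training script itself is not published with this card, so the trainer, data collator, and output path are left unspecified or illustrative:

from transformers import TrainingArguments

# Hyperparameters from the list above; everything else stays at library defaults.
training_args = TrainingArguments(
    output_dir="sqlbot-finetune",   # illustrative path, not from the card
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    warmup_steps=5,
    max_steps=501,
    learning_rate=2e-4,
    optim="adamw_bnb_8bit",         # 8-bit AdamW via bitsandbytes
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,
)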

Evaluation

Testing Data

The evaluation was performed on a subset of the "gretelai/synthetic_text_to_sql" dataset.

Factors

  • Prompt variety: Various natural language prompts were used to assess the model's versatility.
  • Contextual accuracy: The accuracy of the generated SQL queries given specific contexts.

Metrics

  • Accuracy: The correctness of the generated SQL queries was measured against the expected outputs (a minimal scoring sketch follows this list).
  • Efficiency: The computational efficiency was noted in terms of inference time per query.
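
The card does not specify the exact scoring procedure; a minimal baseline, sketched below, is whitespace-normalized exact match against the reference query. Exact match understates true accuracy, since semantically equivalent queries can differ textually:

def normalize(sql: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't count."""
    return " ".join(sql.lower().split()).rstrip(";")

def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    matches = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return matches / len(references)

# Illustrative check with made-up queries.
preds = ["SELECT name FROM users;"]
refs = ["select name   from users"]
print(exact_match_accuracy(preds, refs))  # 1.0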

Results

The fine-tuned SQLBot model demonstrated improved accuracy in generating SQL queries from natural language prompts.

Summary

The SQLBot model successfully bridges the gap between natural language understanding and SQL query generation, making it a useful tool for data analysts and educators.

Environmental Impact

Carbon emissions

  • Hardware Type: Google Colab T4 GPU (Free version)
  • Hours used: Approximately 1 hour
  • Cloud Provider: Google Cloud
  • Compute Region: Not specified (Colab does not expose the region)
  • Carbon Emitted: Approximately 0.05 kg CO2
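
As a rough sanity check on that figure: a T4 draws about 70 W at full load, so one hour of training is roughly 0.07 kWh; at a typical grid carbon intensity of 0.4-0.7 kg CO2/kWh, that corresponds to roughly 0.03-0.05 kg CO2, consistent with the estimate above.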

Technical Specifications

Model Architecture and Objective

The model is based on the Meta-Llama-3-8B architecture, fine-tuned for the specific task of SQL query generation.

Compute Infrastructure

Hardware

  • GPUs used: Google Colab T4 GPU

Software

  • Transformers library
  • Datasets library
  • PyTorch