+---
+base_model: unsloth/llama-3-8B
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- text-to-sql
+- dpo
+- lora
+- transformers
+- trl
+- sql-generation
+- database
+---
+# Text-to-SQL DPO Model
+A LLaMA-3-8B model fine-tuned with Direct Preference Optimization (DPO) for text-to-SQL generation. It was trained with LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning method, and is distributed as an adapter for the base model.
+## Model Details
+### Model Description
+This model is a version of LLaMA-3-8B fine-tuned with Direct Preference Optimization (DPO) for text-to-SQL tasks. It was trained on preference pairs, each pairing a preferred SQL query with a rejected one for the same request, so that it favors accurate queries for natural language descriptions.
+- **Developed by:** faizack
+- **Model type:** Causal Language Model with LoRA adapter
+- **Language(s) (NLP):** English
+- **License:** Apache 2.0 (adapter weights); use of the base model is governed by the Meta Llama 3 Community License
+- **Finetuned from model:** unsloth/llama-3-8B
+### Model Sources
+- **Repository:** [Text-to-SQL DPO Repository](https://github.com/IDEAS-Incubator/text-to-sql_DPO)
+- **Base Model:** [unsloth/llama-3-8B](https://huggingface.co/unsloth/llama-3-8B)
+## Uses
+### Direct Use
+This model is designed for generating SQL queries from natural language descriptions. It can be used for:
+- Converting natural language questions to SQL queries
+- Database query generation
+- Text-to-SQL applications
+- Database interaction interfaces
+### Example Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+import torch
+# Load the base model and tokenizer
+base_model = "unsloth/llama-3-8B"
+tokenizer = AutoTokenizer.from_pretrained(base_model)
+model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
+# Load the LoRA adapter on top of the base model
+model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
+# Generate SQL query
+prompt = "Show me all users from the customers table"
+# Keep the inputs on the same device as the model
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+# max_new_tokens bounds only the generated text; max_length would also count the prompt
+outputs = model.generate(**inputs, max_new_tokens=100)
+# Decode only the newly generated tokens, not the echoed prompt
+new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
+response = tokenizer.decode(new_tokens, skip_special_tokens=True)
+print(response)
+```
+### Out-of-Scope Use
+This model should not be used for:
+- General-purpose text generation beyond SQL queries
+- Generating malicious or harmful SQL queries
+- Database operations without proper validation
+- Production use without proper testing and validation
+## Bias, Risks, and Limitations
+### Limitations
+- The model is specialized for SQL generation and may not perform well on other tasks
+- Generated SQL queries should be validated before execution
+- Performance may vary depending on database schema complexity
+- The model may generate queries that are syntactically correct but logically incorrect
+### Recommendations
+- Always validate generated SQL queries before execution
+- Test the model on your specific database schema
+- Use appropriate safety measures when executing generated queries
+- Consider the model's limitations when integrating into production systems
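+The validation recommended above can be sketched with Python's built-in `sqlite3` module. SQLite is an assumption here, suggested by the name of the `zerolink/zsql-sqlite-dpo` training dataset. The helper `validate_sql` and its read-only policy are hypothetical and not part of this repository. It compiles the query against an empty in-memory copy of the schema, so syntax errors and unknown tables or columns surface without touching real data. It cannot detect queries that are valid but logically wrong.
+```python
+import sqlite3
+
+def validate_sql(query: str, schema_ddl: str) -> tuple[bool, str]:
+    """Check a generated query against a schema without running it on real data."""
+    stripped = query.strip().rstrip(";")
+    # Conservative policy: one statement only (this also rejects ';' inside string literals)
+    if ";" in stripped:
+        return False, "multiple statements are not allowed"
+    # Illustrative read-only policy: accept SELECT statements only
+    if not stripped.lower().startswith("select"):
+        return False, "only SELECT queries are allowed"
+    conn = sqlite3.connect(":memory:")
+    try:
+        conn.executescript(schema_ddl)  # create empty tables from the schema
+        # EXPLAIN QUERY PLAN compiles the statement without executing it
+        conn.execute(f"EXPLAIN QUERY PLAN {stripped}")
+        return True, "ok"
+    except sqlite3.Error as exc:
+        return False, str(exc)
+    finally:
+        conn.close()
+
+schema = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);"
+print(validate_sql("SELECT name FROM customers;", schema))
+print(validate_sql("SELECT email FROM customers", schema))
+```
+A stricter deployment would also run accepted queries through a read-only database connection, since a prefix check alone is a weak guard.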
+## How to Get Started with the Model
+### Installation
+```bash
+pip install transformers peft torch
+```
+### Quick Start
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+# Load model and adapter
+base_model = "unsloth/llama-3-8B"
+model = AutoModelForCausalLM.from_pretrained(base_model)
+model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
+tokenizer = AutoTokenizer.from_pretrained(base_model)
+# Generate SQL
+prompt = "Find all orders placed in the last 30 days"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+# temperature only takes effect when sampling is enabled (do_sample=True)
+outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.1)
+# Decode only the generated continuation, skipping the prompt tokens
+sql_query = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+print(sql_query)
+```
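+Because the base model is not instruction-tuned, the decoded text may repeat the prompt or continue past the end of the query. The sketch below shows one way to keep only the first SQL statement. The helper `extract_sql` is hypothetical, and it assumes the query starts with `SELECT` or `WITH` and ends at the first semicolon.
+```python
+import re
+
+def extract_sql(generated: str, prompt: str = "") -> str:
+    """Return the first SQL statement in the model output, or '' if none is found."""
+    # Drop the echoed prompt so that words in the question are not mistaken for SQL
+    if prompt and generated.startswith(prompt):
+        generated = generated[len(prompt):]
+    # Lazily match from the first SELECT/WITH keyword up to the first ';' or the end
+    match = re.search(r"\b(SELECT|WITH)\b.*?(;|$)", generated, flags=re.IGNORECASE | re.DOTALL)
+    if match is None:
+        return ""
+    # Collapse whitespace and normalize to a single trailing semicolon
+    return " ".join(match.group(0).split()).rstrip(";") + ";"
+
+raw = "Show me all users\nSELECT *\nFROM customers; SELECT 2;"
+print(extract_sql(raw, prompt="Show me all users"))
+```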
+## Training Details
+### Training Data
+The model was trained on the `zerolink/zsql-sqlite-dpo` dataset, which contains preference pairs for text-to-SQL tasks.
+### Training Procedure
+#### Training Hyperparameters
+- **Training regime:** DPO (Direct Preference Optimization)
+- **Epochs:** 6
+- **Batch size:** 2
+- **Gradient accumulation:** 32
+- **Learning rate:** 5e-5
+- **LoRA rank:** 16
+- **LoRA alpha:** 16
+- **LoRA dropout:** 0.05
+- **Target modules:** q_proj, v_proj
+#### Training Infrastructure
+- **Base model:** unsloth/llama-3-8B
+- **Framework:** PEFT (Parameter-Efficient Fine-Tuning)
+- **Training method:** LoRA (Low-Rank Adaptation)
+- **Total steps:** 120
+- **Steps per epoch:** 3660
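+The step counts above can be related to the batch settings with a little arithmetic. The sketch assumes single-device training and that "steps" means optimizer steps; neither is stated in this card. Under those assumptions each step consumes 64 preference pairs, 3660 steps per epoch correspond to about 234k pairs, and 120 total steps cover about 3% of one epoch.
+```python
+per_device_batch_size = 2
+gradient_accumulation_steps = 32
+num_devices = 1  # assumption: single-GPU training
+
+# Preference pairs consumed per optimizer step
+effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
+
+steps_per_epoch = 3660
+total_steps = 120
+
+pairs_per_epoch = steps_per_epoch * effective_batch_size
+pairs_seen = total_steps * effective_batch_size
+fraction_of_epoch = total_steps / steps_per_epoch
+
+print(effective_batch_size)        # 64
+print(pairs_per_epoch)             # 234240
+print(pairs_seen)                  # 7680
+print(f"{fraction_of_epoch:.1%}")  # 3.3%
+```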
+## Technical Specifications
+### Model Architecture
+- **Base architecture:** LLaMA-3-8B
+- **Adapter type:** LoRA
+- **Trainable parameters:** ~16M (LoRA adapter only)
+- **Total parameters:** ~8B (base model + adapter)
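+As a rough cross-check of the adapter size, LoRA adds `r * (d_in + d_out)` weights per adapted linear layer. The sketch below applies that formula to the rank and target modules listed under Training Hyperparameters. The layer shapes are assumptions taken from the public Llama 3 8B configuration (hidden size 4096, 32 layers, 8 key/value heads of dimension 128), not from this repository. With these assumed shapes, rank 16 on `q_proj` and `v_proj` gives about 6.8M weights. A higher rank or more target modules raises the count, so the adapter's `adapter_config.json` remains the authoritative record of what was trained.
+```python
+# Assumed Llama 3 8B shapes (from the public model config, not this repo)
+HIDDEN_SIZE = 4096
+NUM_LAYERS = 32
+NUM_KV_HEADS = 8
+HEAD_DIM = 128
+
+# (input features, output features) of each adapted projection
+PROJECTION_SHAPES = {
+    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
+    "v_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
+}
+
+def lora_param_count(rank, target_modules):
+    """LoRA adds A (rank x d_in) and B (d_out x rank) to each adapted layer."""
+    per_layer = sum(
+        rank * (d_in + d_out)
+        for d_in, d_out in (PROJECTION_SHAPES[name] for name in target_modules)
+    )
+    return per_layer * NUM_LAYERS
+
+trainable = lora_param_count(rank=16, target_modules=["q_proj", "v_proj"])
+print(f"{trainable:,} trainable LoRA parameters")  # 6,815,744
+```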
+### Compute Infrastructure
+- **Hardware:** GPU-based training
+- **Framework versions:**
+  - PEFT: 0.17.1
+  - Transformers: 4.56.2
+  - PyTorch: Compatible with CUDA
+## Citation
+If you use this model in your research, please cite:
+```bibtex
+@misc{text-to-sql-dpo-2024,
+  title={Text-to-SQL DPO Model},
+  author={faizack},
+  year={2024},
+  url={https://huggingface.co/faizack/text-to-sql-dpo}
+}
+```
+## Model Card Contact
+For questions or issues related to this model, please contact the model author or open an issue in the repository.