---
license: bigcode-openrail-m
library_name: peft
tags:
  - sql
  - spider
datasets:
  - spider
  - richardr1126/spider-skeleton-context-instruct
base_model: WizardLM/WizardCoder-15B-V1.0
---

# Spider Skeleton Wizard Coder QLoRA Adapter Summary

## Spider Dataset

Spider is a large-scale, complex, and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students. The goal of the Spider challenge is to develop natural language interfaces to cross-domain databases.

This dataset was used to fine-tune this model.

## Project Description

This project aims to use off-the-shelf large language models for text-to-SQL program synthesis. After experimenting with various models, fine-tuning hyperparameters, and training datasets, an optimal solution was identified: fine-tuning the WizardLM/WizardCoder-15B-V1.0 base model with QLoRA techniques on this customized Spider training dataset. The resulting model, richardr1126/spider-skeleton-wizard-coder-merged, achieves 61% execution accuracy when evaluated. The project uses a custom validation dataset that incorporates database context into the question. A live demonstration of the model is available on a Hugging Face Space, which uses the Gradio library to provide a user-friendly GUI.

Note: You might have to wake the Space up if it is sleeping; this should take less than 10 minutes.
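
Per the metadata above, this repo is a `peft` (LoRA) adapter on top of the WizardLM/WizardCoder-15B-V1.0 base model. Below is a minimal sketch of loading the 4-bit-quantized base model and attaching the adapter with the standard `transformers`/`peft` APIs; the adapter repo id is a placeholder for this card's repo, and the quantization settings are assumed QLoRA-style defaults, not values confirmed by the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE = "WizardLM/WizardCoder-15B-V1.0"
ADAPTER = "richardr1126/spider-skeleton-wizard-coder-qlora"  # placeholder: this card's repo id

# Quantize the frozen base model to 4-bit NF4, mirroring a QLoRA-style setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb_config, device_map="auto"
)

# Attach the LoRA adapter weights on top of the quantized base model.
model = PeftModel.from_pretrained(base_model, ADAPTER)
```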

## Spider Skeleton WizardCoder - test-suite-sql-eval Results

With `temperature` set to 0.0, `top_p` set to 0.9, and `top_k` set to 0, the model achieves 61% execution accuracy on the Spider dev set.
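
For reference, a short sketch of how those decoding settings translate to `model.generate`, continuing from the loading sketch above. With `temperature` at 0.0, sampling reduces to greedy decoding, which `transformers` expresses as `do_sample=False`; the prompt shown is only a hypothetical question-plus-schema example, not the card's exact prompt format.

```python
# Hypothetical prompt: a natural-language question with its database schema context.
prompt = "### Question: How many singers are there?\n### Schema: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# temperature=0.0 (with top_p=0.9, top_k=0) is equivalent to greedy decoding,
# which transformers expresses as do_sample=False.
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```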

Note:

- ChatGPT was evaluated with the default hyperparameters and with the system message: "You are a sophisticated AI assistant capable of converting text into SQL queries. You can only output SQL, don't add any other text."
- Both models were evaluated with `--plug_value` in `evaluation.py` using the Spider dev set with database context (an illustrative invocation is shown below).
  - `--plug_value`: if set, the gold value is plugged into the predicted query. This is suitable if your model does not predict values. It is set to False by default.
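
An illustrative invocation of `evaluation.py` from the test-suite-sql-eval repo; the file paths below are placeholders, and the exact setup is described in that repo's README.

```bash
# Sketch only: gold/pred/db/table paths are placeholders.
python3 evaluation.py \
  --gold dev_gold.sql \
  --pred predictions.sql \
  --db database/ \
  --table tables.json \
  --etype exec \
  --plug_value
```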

## Citation

Please cite this repo if you use its data or code.

```bibtex
@article{luo2023wizardcoder,
  title={WizardCoder: Empowering Code Large Language Models with Evol-Instruct},
  author={Ziyang Luo and Can Xu and Pu Zhao and Qingfeng Sun and Xiubo Geng and Wenxiang Hu and Chongyang Tao and Jing Ma and Qingwei Lin and Daxin Jiang},
  journal={arXiv preprint arXiv:2306.08568},
  year={2023}
}

@article{yu2018spider,
  title={Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-SQL task},
  author={Yu, Tao and Zhang, Rui and Yang, Kai and Yasunaga, Michihiro and Wang, Dongxu and Li, Zifan and Ma, James and Li, Irene and Yao, Qingning and Roman, Shanelle and others},
  journal={arXiv preprint arXiv:1809.08887},
  year={2018}
}

@article{dettmers2023qlora,
  title={QLoRA: Efficient Finetuning of Quantized LLMs},
  author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:2305.14314},
  year={2023}
}
```

## Disclaimer

The resources, including code, data, and model weights, associated with this project are restricted for academic research purposes only and cannot be used for commercial purposes. The content produced by any version of WizardCoder is influenced by uncontrollable variables such as randomness, and therefore, the accuracy of the output cannot be guaranteed by this project. This project does not accept any legal liability for the content of the model output, nor does it assume responsibility for any losses incurred due to the use of associated resources and output results.

## Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `TrainingArguments` follows the list):

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 1000
- mixed_precision_training: Native AMP
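
For reference, a minimal sketch of how these values would map onto `transformers.TrainingArguments`. This is an assumed mapping, not the author's actual training script; `output_dir` is a placeholder, and the batch sizes are assumed to be per-device.

```python
from transformers import TrainingArguments

# Assumed mapping of the listed hyperparameters; not the original training script.
training_args = TrainingArguments(
    output_dir="spider-skeleton-qlora",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=16,      # assumes train_batch_size is per device
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",                 # Adam with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    max_steps=1000,
    fp16=True,                           # "Native AMP" mixed precision
)
```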

## Framework versions

- Transformers 4.30.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.13.1
- Tokenizers 0.13.3