Spaces: richardr1126/sql-skeleton-wizardcoder-demo


richardr1126 committed on Jul 29, 2023

Commit dd9d480

1 Parent(s): ef53845

New and improved


Files changed (2)

README.md +30 -0
app-ngrok.py +15 -5

README.md CHANGED

@@ -10,4 +10,34 @@ pinned: true
 license: bigcode-openrail-m
 ---
+## Citations
+```
+@misc{luo2023wizardcoder,
+      title={WizardCoder: Empowering Code Large Language Models with Evol-Instruct},
+      author={Ziyang Luo and Can Xu and Pu Zhao and Qingfeng Sun and Xiubo Geng and Wenxiang Hu and Chongyang Tao and Jing Ma and Qingwei Lin and Daxin Jiang},
+      year={2023},
+}
+```
+```
+@article{yu2018spider,
+  title={Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task},
+  author={Yu, Tao and Zhang, Rui and Yang, Kai and Yasunaga, Michihiro and Wang, Dongxu and Li, Zifan and Ma, James and Li, Irene and Yao, Qingning and Roman, Shanelle and others},
+  journal={arXiv preprint arXiv:1809.08887},
+  year={2018}
+}
+```
+```
+@article{dettmers2023qlora,
+  title={QLoRA: Efficient Finetuning of Quantized LLMs},
+  author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
+  journal={arXiv preprint arXiv:2305.14314},
+  year={2023}
+}
+```
+## Disclaimer
+The resources, including code, data, and model weights, associated with this project are restricted for academic research purposes only and cannot be used for commercial purposes. The content produced by any version of WizardCoder is influenced by uncontrollable variables such as randomness, and therefore, the accuracy of the output cannot be guaranteed by this project. This project does not accept any legal liability for the content of the model output, nor does it assume responsibility for any losses incurred due to the use of associated resources and output results.
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

app-ngrok.py CHANGED

@@ -3,6 +3,9 @@ import gradio as gr
 import sqlparse
 import requests
 from time import sleep
+import re
 def format(text):
     # Split the text by "|", and get the last element in the list which should be the final query
@@ -23,8 +26,7 @@ def format(text):
     return final_query_markdown
-def bot(input_message: str, db_info="", temperature=0.3, top_p=0.9, top_k=0, repetition_penalty=1.08):
+def generate(input_message: str, db_info="", temperature=0.3, top_p=0.9, top_k=0, repetition_penalty=1.08):
     # Format the user's input message
     messages = f"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n\nConvert text to sql: {input_message} {db_info}\n\n### Response:\n\n"
@@ -62,10 +64,11 @@ def bot(input_message: str, db_info="", temperature=0.3, top_p=0.9, top_k=0, rep
             print('Waiting for 10 seconds before retrying...')
             sleep(10)
+# Gradio UI Code
 with gr.Blocks(theme='gradio/soft') as demo:
     header = gr.HTML("""
         <h1 style="text-align: center">SQL Skeleton WizardCoder Demo</h1>
-        <h3 style="text-align: center">🧙‍♂️ Generate SQL queries from Natural Language 🧙‍♂️</h3>
+        <h3 style="text-align: center">🕷️☠️🧙‍♂️ Generate SQL queries from Natural Language 🕷️☠️🧙‍♂️</h3>
     """)
     output_box = gr.Code(label="Generated SQL", lines=2, interactive=True)
@@ -79,6 +82,7 @@ with gr.Blocks(theme='gradio/soft') as demo:
         repetition_penalty = gr.Slider(label="Repetition Penalty", minimum=1.0, maximum=2.0, value=1.08, step=0.01)
     run_button = gr.Button("Generate SQL", variant="primary")
+    run_button.click(fn=generate, inputs=[input_text, db_info, temperature, top_p, top_k, repetition_penalty], outputs=output_box, api_name="txt2sql")
     with gr.Accordion("Examples", open=True):
         examples = gr.Examples([
@@ -87,7 +91,7 @@ with gr.Blocks(theme='gradio/soft') as demo:
             ["What are the number of concerts that occurred in the stadium with the largest capacity ?", "| stadium : stadium_id , location , name , capacity , highest , lowest , average | singer : singer_id , name , country , song_name , song_release_year , age , is_male | concert : concert_id , concert_name , theme , stadium_id , year | singer_in_concert : concert_id , singer_id | concert.stadium_id = stadium.stadium_id | singer_in_concert.singer_id = singer.singer_id | singer_in_concert.concert_id = concert.concert_id |"],
             ["How many male singers performed in concerts in the year 2023?", "| stadium : stadium_id , location , name , capacity , highest , lowest , average | singer : singer_id , name , country , song_name , song_release_year , age , is_male | concert : concert_id , concert_name , theme , stadium_id , year | singer_in_concert : concert_id , singer_id | concert.stadium_id = stadium.stadium_id | singer_in_concert.singer_id = singer.singer_id | singer_in_concert.concert_id = concert.concert_id |"],
             ["List the names of all singers who performed in a concert with the theme 'Rock'", "| stadium : stadium_id , location , name , capacity , highest , lowest , average | singer : singer_id , name , country , song_name , song_release_year , age , is_male | concert : concert_id , concert_name , theme , stadium_id , year | singer_in_concert : concert_id , singer_id | concert.stadium_id = stadium.stadium_id | singer_in_concert.singer_id = singer.singer_id | singer_in_concert.concert_id = concert.concert_id |"]
-        ], inputs=[input_text, db_info, temperature, top_p, top_k, repetition_penalty], fn=bot, cache_examples=True, outputs=output_box)
+        ], inputs=[input_text, db_info, temperature, top_p, top_k, repetition_penalty], fn=generate, cache_examples=True, outputs=output_box)
     quantized_model = "richardr1126/spider-skeleton-wizard-coder-ggml"
     merged_model = "richardr1126/spider-skeleton-wizard-coder-merged"
@@ -100,9 +104,15 @@ with gr.Blocks(theme='gradio/soft') as demo:
         <p>🌐 Leveraging the <a href='https://huggingface.co/{quantized_model}'><strong>4-bit GGML version</strong></a> of <a href='https://huggingface.co/{merged_model}'><strong>{merged_model}</strong></a> model.</p>
         <p>🔗 How it's made: <a href='https://huggingface.co/{initial_model}'><strong>{initial_model}</strong></a> was finetuned to create <a href='https://huggingface.co/{lora_model}'><strong>{lora_model}</strong></a>, then merged together to create <a href='https://huggingface.co/{merged_model}'><strong>{merged_model}</strong></a>.</p>
         <p>📉 Fine-tuning was performed using QLoRA techniques on the <a href='https://huggingface.co/datasets/{dataset}'><strong>{dataset}</strong></a> dataset. You can view training metrics on the <a href='https://huggingface.co/{lora_model}'><strong>QLoRa adapter HF Repo</strong></a>.</p>
     """)
-    run_button.click(fn=bot, inputs=[input_text, db_info, temperature, top_p, top_k, repetition_penalty], outputs=output_box, api_name="txt2sql")
+    readme_content = requests.get(f"https://huggingface.co/{merged_model}/raw/main/README.md").text
+    readme_content = re.sub('---.*?---', '', readme_content, flags=re.DOTALL) #Remove YAML front matter
+    with gr.Accordion("📖 Model Readme", open=True):
+        readme = gr.Markdown(
+            readme_content,
+        )
 demo.queue(concurrency_count=1, max_size=10).launch(debug=True)
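
Review note on the last hunk: the front matter removal uses `re.sub('---.*?---', '', readme_content, flags=re.DOTALL)`. That pattern is not anchored to the start of the text and has no `count` limit, so it would also appear to remove anything that sits between two later `---` horizontal rules in the README body. The sketch below shows one possible anchored variant using only the standard `re` module. The helper name `strip_front_matter` and the sample text are hypothetical and are not part of this commit.

```python
import re

# Hypothetical helper (not in the commit): matches a YAML front matter block
# only when it opens on the very first line of the text.
FRONT_MATTER = re.compile(r"\A---\s*\n.*?\n---\s*(?:\n|\Z)", flags=re.DOTALL)


def strip_front_matter(markdown_text: str) -> str:
    """Return markdown_text without a leading '---' ... '---' block."""
    # count=1 plus the \A anchor keeps later '---' rules in the body intact.
    return FRONT_MATTER.sub("", markdown_text, count=1)


# Made-up sample README: front matter followed by a body with two rules.
sample = (
    "---\nlicense: bigcode-openrail-m\n---\n"
    "# Title\n\nintro\n\n---\n\nmore text\n\n---\n\nend\n"
)

cleaned = strip_front_matter(sample)
unanchored = re.sub("---.*?---", "", sample, flags=re.DOTALL)
```

On this sample, `cleaned` keeps the "more text" section between the two horizontal rules, while `unanchored` (the pattern from the diff) drops it. Whether the real model README contains such rules is not visible in this diff, so this is a precaution and not a confirmed bug report.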