C5i / NatSight-t5-small-wikisql

Text2Text Generation

NatSight-AdpSeq2Seq

rohitsroch committed on Feb 15, 2023

Commit ed8bed3 • 1 Parent(s): 790f0b6

Update README.md

Files changed (1)

README.md +31 -2

@@ -7,12 +7,14 @@ tags:
 - Text2SQL
 datasets:
 - wikisql
 ---
 ## Paper
 ## [NatSight: A framework for building domain agnostic Natural Language Interface to Databases for next-gen Augmented Analytics](https://dcal.iimb.ac.in/baiconf2022/full_papers/2346.pdf)
-Aurthors: *Rohit Sroch*, *Dhiraj Patnaik*, *Jayachandran Ramachandran*
 ## Abstract
@@ -29,8 +31,35 @@ Experiment results on benchmark datasets show that our approach achieves a state
 ## NatSight-t5-small-wikisql
- For weights initialization, we used [t5-small](https://huggingface.co/t5-small)
 ## Intended uses & limitations

 - Text2SQL
 datasets:
 - wikisql
+widget:
+- text: "translate English to Sql: What was the number of race that Kevin Curtain won? </s> c0 | number <eom>  v4 | Kevin Curtain </s> c0 | No <eom> c1 | Date <eom> c2 | Round <eom> c3 | Circuit <eom> c4 | Pole_Position <eom> c5 | Fastest_Lap <eom> c6 | Race_winner <eom> c7 | Report"
 ---
 ## Paper
 ## [NatSight: A framework for building domain agnostic Natural Language Interface to Databases for next-gen Augmented Analytics](https://dcal.iimb.ac.in/baiconf2022/full_papers/2346.pdf)
+Authors: *Rohit Sroch*, *Dhiraj Patnaik*, *Jayachandran Ramachandran*
 ## Abstract
 ## NatSight-t5-small-wikisql
+ For weight initialization, we used [t5-small](https://huggingface.co/t5-small) and fine-tuned it as a sequence-to-sequence task.
+## Using Transformers🤗
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+tokenizer = AutoTokenizer.from_pretrained("course5i/NatSight-t5-small-wikisql")
+model = AutoModelForSeq2SeqLM.from_pretrained("course5i/NatSight-t5-small-wikisql")
+# define input
+prefix = "translate English to Sql: "
+raw_nat_query = "What was the number of race that Kevin Curtain won?"
+query_mention_schema = "c0 | number <eom>  v4 | Kevin Curtain"
+table_header_schema = "c0 | No <eom> c1 | Date <eom> c2 | Round <eom> c3 | Circuit <eom> c4 | Pole_Position <eom> c5 | Fastest_Lap <eom> c6 | Race_winner <eom> c7 | Report"
+encoder_input = prefix + raw_nat_query + " </s> " + query_mention_schema + " </s> " + table_header_schema
+input_ids = tokenizer.encode(encoder_input, return_tensors="pt", add_special_tokens=True)
+generated_ids = model.generate(input_ids=input_ids, num_beams=5, max_length=128)
+preds = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True) for g in generated_ids]
+output = preds[0]
+print("Output generic SQL query: {}".format(output))
+# output
+"SELECT COUNT(c0) FROM TABLE WHERE c4 = v4"
+```
 ## Intended uses & limitations
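
The usage snippet added in this commit builds `encoder_input` by hand from three strings joined with ` </s> `. The following is a minimal sketch of how that layout could be assembled from structured data, and how the generic SQL output might be mapped back to real column names. The helper names (`serialize`, `build_encoder_input`, `fill_placeholders`) are hypothetical and not part of the model card. The sketch assumes that `cN` placeholders index the table header, that `vN` placeholders stand for mentioned values, and that a single space after `<eom>` is acceptable (the README example contains a double space at one point).

```python
import re

SEP = " </s> "   # part separator, as used in the README example
EOM = " <eom> "  # marker between entries, as used in the README example


def serialize(pairs):
    """Join (placeholder, text) pairs as 'c0 | No <eom> c1 | Date ...'."""
    return EOM.join(f"{key} | {text}" for key, text in pairs)


def build_encoder_input(question, mentions, headers,
                        prefix="translate English to Sql: "):
    """Assemble prefix + question, query mentions, and table header."""
    # Assumption: header placeholders are numbered c0, c1, ... in order.
    header_pairs = [(f"c{i}", name) for i, name in enumerate(headers)]
    return (prefix + question + SEP + serialize(mentions)
            + SEP + serialize(header_pairs))


def fill_placeholders(generic_sql, mapping):
    """Replace whole-word placeholders such as c0 or v4 using `mapping`.

    Hypothetical post-processing step; placeholders that are missing
    from `mapping` are left unchanged.
    """
    return re.sub(r"\b[cv]\d+\b",
                  lambda m: mapping.get(m.group(0), m.group(0)),
                  generic_sql)


headers = ["No", "Date", "Round", "Circuit", "Pole_Position",
           "Fastest_Lap", "Race_winner", "Report"]
mentions = [("c0", "number"), ("v4", "Kevin Curtain")]

encoder_input = build_encoder_input(
    "What was the number of race that Kevin Curtain won?", mentions, headers)

# Map the generic query from the README back to concrete names.
mapping = {"c0": "No", "c4": "Pole_Position", "v4": "'Kevin Curtain'"}
concrete_sql = fill_placeholders(
    "SELECT COUNT(c0) FROM TABLE WHERE c4 = v4", mapping)
print(concrete_sql)
```

The resulting `encoder_input` can be passed to `tokenizer.encode` exactly as in the README snippet. Whether the model is sensitive to the whitespace difference noted above has not been checked here.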