---
language:
- en
tags:
- text2sql
- spider
- Transformer
- Pytorch
license: mit
---
## Model Description

Graphix-T5 is a graph-aware, semi-pretrained text-to-text pre-trained language model (PLM) designed to improve multi-hop reasoning in complex text-to-SQL tasks. 
Its architecture strengthens the structural encoding of T5 while preserving T5's contextual encoding ability. 
Experimental results demonstrate the effectiveness of Graphix-T5 and underscore the importance of incorporating structural information into text-to-text PLMs for difficult text-to-SQL problems. 
The smaller performance gap between the dev and test sets also indicates that Graphix-T5 generalizes better.

## Training Data
Graphix-3B is trained on SPIDER, a cross-domain text-to-SQL benchmark. It is evaluated on the vanilla SPIDER dev and test sets and on the variants SPIDER-SYN, SPIDER-DK, and 
SPIDER-REALISTIC **without additional training**. The model will continue to be fine-tuned on more complex text-to-SQL data, 
such as BIRD, to handle harder and more realistic applications.

## To Begin With

You can load the tokenizer and model weights with `transformers` as shown below. The snippet only loads the checkpoint; for end-to-end text-to-SQL inference, see the Graphix official implementation:
```py
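from transformers import AutoTokenizer, AutoModel

# Fetch the tokenizer from the Hugging Face Hub (cached after the first call)
tokenizer = AutoTokenizer.from_pretrained("patrickNLP/Graphix-3B")

# Fetch the model weights from the same repository
model = AutoModel.from_pretrained("patrickNLP/Graphix-3B")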
```

## Performance
Graphix-3B with PICARD achieves state-of-the-art (SOTA) semantic parsing performance on the [`SPIDER`](https://yale-lily.github.io/spider) leaderboard. Its single submission reaches **74.0%** exact match (EM) and **77.6%** execution accuracy (EX) on the test set.
Please see [`Graphix Official Implementation`]() for details.
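
To make the EX number concrete, here is a minimal sketch of how execution accuracy can be computed: a prediction counts as correct when it returns the same rows as the gold query on the target database. This is not the official SPIDER evaluation script, which handles more cases (for example, row order under `ORDER BY`). The `execution_match` helper, the toy `singer` table, and the query pairs are all hypothetical and chosen only for illustration.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same multiset of rows.

    Simplification (assumption): row order is ignored for every query.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as incorrect.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(predicted_rows) == Counter(gold_rows)


# Toy in-memory database (hypothetical, not taken from SPIDER).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 31), (2, 'Bo', 45), (3, 'Cy', 27);
    """
)

# (predicted SQL, gold SQL) pairs; the first differs in form but not in result.
pairs = [
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    ("SELECT count(*) FROM singer",
     "SELECT count(*) FROM singer WHERE age > 30"),
]

results = [execution_match(conn, pred, gold) for pred, gold in pairs]
ex_accuracy = sum(results) / len(results)
print(f"EX = {ex_accuracy:.1%}")
```

The first pair shows why EX can exceed EM: the two queries are written differently, so a structural comparison would mark them as different, yet they return identical rows.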

## Reference
1. [`Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing`](https://arxiv.org/abs/2301.07507)
2. [`Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs`](https://arxiv.org/abs/2305.03111)
3. [`Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task`](https://arxiv.org/abs/1809.08887)
4. [`PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models`](https://arxiv.org/abs/2109.05093)


## Citation
```
@misc{li2023graphixt5,
      title={Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing}, 
      author={Jinyang Li and Binyuan Hui and Reynold Cheng and Bowen Qin and Chenhao Ma and Nan Huo and Fei Huang and Wenyu Du and Luo Si and Yongbin Li},
      year={2023},
      eprint={2301.07507},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```