---
license: cc-by-4.0
---
# SchemaPile Foreign Key Detection Model (T5-base)
## Model Description
This repository introduces **t5-schemapile-fk**, a language model based on [google-t5/t5-base](https://huggingface.co/google-t5/t5-base) and fine-tuned to predict foreign key relationships in relational database schemas.
## Training Data
The model was fine-tuned on foreign key pairs extracted from [SchemaPile-Perm](https://schemapile.github.io), a large collection of relational database schemas.
## Evaluation Data
We evaluate the foreign key detection accuracy of [starcoder-schemapile-fk](https://huggingface.co/tdoehmen/starcoder-schemapile-fk) and [t5-schemapile-fk](https://huggingface.co/tdoehmen/t5-schemapile-fk) on schemas from [Spider](https://yale-lily.github.io/spider), [BIRD-SQL](https://bird-bench.github.io/), and [CTU PRLR](https://arxiv.org/abs/1511.03086).
<img src="https://cdn-uploads.huggingface.co/production/uploads/616ea71919594606318887e9/6ouh4u6PFQlY8prLrAm4l.png" alt="eval" width="400"/>
## Training Procedure
The model was trained using the following hyperparameters:
- batch_size=16
- learning_rate=4e-5
- weight_decay=0.01
- num_train_epochs=1
See [Training Code](https://github.com/amsterdata/schemapile/blob/main/experiments/foreign_key_detection/finetune-t5-schemapile.ipynb).
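
For orientation, the hyperparameters listed above could be collected as keyword arguments for the Hugging Face `Seq2SeqTrainingArguments` class. This is only a sketch: mapping `batch_size` to `per_device_train_batch_size` is an assumption, and the linked training notebook remains the authoritative reference.

```python
# Hyperparameters from the list above, keyed by argument names that
# transformers.Seq2SeqTrainingArguments accepts.
training_kwargs = {
    # Assumption: the stated batch_size is the per-device train batch size.
    "per_device_train_batch_size": 16,
    "learning_rate": 4e-5,
    "weight_decay": 0.01,
    "num_train_epochs": 1,
}

# Possible usage (requires `transformers`; the output_dir is a hypothetical path):
#   from transformers import Seq2SeqTrainingArguments
#   args = Seq2SeqTrainingArguments(output_dir="t5-fk-out", **training_kwargs)
```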
## How to Use
We recommend using the following prompt template:
Example Prompt:
```
You are given the following SQL database tables:
staff(staff_id, staff_address_id, nickname, first_name, middle_name, last_name, date_of_birth, date_joined_staff, date_left_staff)
addresses(address_id, line_1_number_building, city, zip_postcode, state_province_county, country)
Output a json string with the following schema {table, column, referencedTable, referencedColumn} that contains the foreign key relationship between the two tables.
```
Example Output:
```
{'table': 'staff',
'column': 'staff_address_id',
'referencedTable': 'addresses',
'referencedColumn': 'address_id'}
```
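
The example output above uses single quotes, so a strict JSON parser would reject it. The following sketch tries `json.loads` first and falls back to `ast.literal_eval`, assuming the decoded model output has the form shown above. The helper name `parse_foreign_key` is hypothetical and not part of this repository.

```python
import ast
import json

# Keys named in the prompt's output schema.
REQUIRED_KEYS = {"table", "column", "referencedTable", "referencedColumn"}


def parse_foreign_key(output_text):
    """Parse the model output into a dict with the four schema keys."""
    text = output_text.strip()
    try:
        # Works if the output is valid JSON (double-quoted strings).
        result = json.loads(text)
    except json.JSONDecodeError:
        # Falls back to a Python literal, as in the single-quoted example.
        # Malformed text raises ValueError or SyntaxError here.
        result = ast.literal_eval(text)
    if not isinstance(result, dict) or not REQUIRED_KEYS.issubset(result):
        raise ValueError(f"unexpected model output: {output_text!r}")
    return result


example_output = """{'table': 'staff',
'column': 'staff_address_id',
'referencedTable': 'addresses',
'referencedColumn': 'address_id'}"""

fk = parse_foreign_key(example_output)
print(fk["table"], fk["column"], "->", fk["referencedTable"], fk["referencedColumn"])
```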
To run the model locally, we recommend using our end-to-end [Example Notebook](https://github.com/amsterdata/schemapile/blob/main/experiments/foreign_key_detection/t5-schemapile-fk-example.ipynb).
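
As a rough illustration of how the prompt template above might be assembled and passed to the model with the `transformers` library, consider the sketch below. It is not taken from the example notebook: the helper names `build_prompt` and `predict_foreign_key`, the newline-separated layout of the prompt, and the `max_new_tokens` value are all assumptions. The model ID is the one from the model link in the Evaluation Data section.

```python
PROMPT_HEADER = "You are given the following SQL database tables:"
PROMPT_INSTRUCTION = (
    "Output a json string with the following schema "
    "{table, column, referencedTable, referencedColumn} "
    "that contains the foreign key relationship between the two tables."
)


def build_prompt(tables):
    """Build the prompt from a dict mapping table name -> list of column names.

    Assumption: the header, table lines, and instruction are joined by newlines.
    """
    table_lines = [
        f"{name}({', '.join(columns)})" for name, columns in tables.items()
    ]
    return "\n".join([PROMPT_HEADER, *table_lines, PROMPT_INSTRUCTION])


def predict_foreign_key(prompt, model_id="tdoehmen/t5-schemapile-fk", max_new_tokens=128):
    """Generate the model's raw text prediction for a prompt.

    Requires `transformers` and `torch`; imported lazily so that
    build_prompt can be used without them. max_new_tokens is an assumed value.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_prompt({
    "staff": ["staff_id", "staff_address_id", "nickname"],
    "addresses": ["address_id", "city", "country"],
})
print(prompt)
# To download the model and run it:
#   print(predict_foreign_key(prompt))
```

The raw generated text may need post-processing before it can be parsed, so treat the notebook as the reference for the full pipeline.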