---
license: apache-2.0
pipeline_tag: text-generation
tags:
- code
datasets:
- semiotic/SynQL-KaggleDBQA-Train
language:
- en
base_model:
- google-t5/t5-3b
---

# Model Card for T5-3B/SynQL-KaggleDBQA-Train-Run-01

- Developed by: Semiotic Labs
- Model type: Text to SQL
- License: Apache-2.0
- Finetuned from model: [google-t5/t5-3b](https://huggingface.co/google-t5/t5-3b)
- Dataset used for finetuning: [semiotic/SynQL-KaggleDBQA-Train](https://huggingface.co/datasets/semiotic/SynQL-KaggleDBQA-Train/blob/main/README.md)

## Model Context

Example metadata is shown below. The `context` field is the prompt presented to the model: the question, the database ID, and each table with its columns, separated by ` | `. Database schemas follow the encoding method proposed by [Shaw et al. (2020)](https://arxiv.org/pdf/2010.12725).

```
"query": "SELECT count(*) FROM singer",
"question": "How many singers do we have?",
"context": "How many singers do we have? | concert_singer | stadium : stadium_id, location, name, capacity, highest, lowest, average | singer : singer_id, name, country, song_name, song_release_year, age, is_male | concert : concert_id, concert_name, theme, stadium_id, year | singer_in_concert : concert_id, singer_id",
"db_id": "concert_singer",
```

## Model Results

Evaluation set: [KaggleDBQA/test](https://github.com/Chia-Hsuan-Lee/KaggleDBQA)

Evaluation metric: Execution Accuracy

| Model | Data | Run | Execution Accuracy |
|-------|------|-----|-------------------|
| T5-3B | semiotic/SynQL-KaggleDBQA | 00 | 0.3514 |
| T5-3B | semiotic/SynQL-KaggleDBQA | 01 | 0.3514 |
| T5-3B | semiotic/SynQL-KaggleDBQA | 02 | 0.3514 |
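
To make the prompt format from the Model Context section concrete, here is a minimal sketch of how a `context` string could be assembled from a question and a schema. The function name `build_context` and the dictionary-based schema representation are assumptions for illustration; they are not part of the released training code, and only the output format is taken from the example in this card.

```python
def build_context(question, db_id, schema):
    """Serialize a question and schema into the prompt format shown in this card.

    `schema` is a hypothetical representation: a dict mapping each table name
    to an ordered list of its column names.
    """
    # Each table becomes "table : col1, col2, ..."
    tables = [f"{table} : {', '.join(columns)}" for table, columns in schema.items()]
    # Question, database ID, and tables are joined with " | "
    return " | ".join([question, db_id, *tables])


# Schema taken from the example metadata in this card
schema = {
    "stadium": ["stadium_id", "location", "name", "capacity",
                "highest", "lowest", "average"],
    "singer": ["singer_id", "name", "country", "song_name",
               "song_release_year", "age", "is_male"],
    "concert": ["concert_id", "concert_name", "theme", "stadium_id", "year"],
    "singer_in_concert": ["concert_id", "singer_id"],
}

context = build_context("How many singers do we have?", "concert_singer", schema)
print(context)
```

For this input, the printed string matches the `context` value in the example metadata. Table and column order follow the insertion order of the dictionary, so the schema should be listed in the order it is meant to appear in the prompt.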