---
language: en
tags:
- table-to-text
datasets:
- logicnlg
---

# ReasTAP

ReasTAP is a table reasoning model proposed in the EMNLP 2022 paper [ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples](https://arxiv.org/pdf/2210.12374.pdf). The original GitHub repository is [https://github.com/Yale-LILY/ReasTAP](https://github.com/Yale-LILY/ReasTAP).

## Description

`Yale-LILY/reastap-large-finetuned-logicnlg` is initialized from `Yale-LILY/reastap-large` and fine-tuned on [LogicNLG](https://arxiv.org/pdf/2004.10404.pdf).

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import pandas as pd

# Load the tokenizer and the fine-tuned sequence-to-sequence model.
tokenizer = AutoTokenizer.from_pretrained("Yale-LILY/reastap-large-finetuned-logicnlg")
model = AutoModelForSeq2SeqLM.from_pretrained("Yale-LILY/reastap-large-finetuned-logicnlg")

# Build the input table as a pandas DataFrame.
data = {
    "year": [1896, 1900, 1904, 2004, 2008, 2012],
    "city": ["athens", "paris", "st. louis", "athens", "beijing", "london"]
}
table = pd.DataFrame.from_dict(data)

# The table title is passed as the query.
title = "Olympic Games"

encoding = tokenizer(table=table, query=title, return_tensors="pt")
outputs = model.generate(**encoding)

print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
# ['the olympic game were held in athens 2 time, in 1896 and 2004']
```

## Reference

```bibtex
@inproceedings{zhao-etal-2022-reastap,
    title = "{R}eas{TAP}: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples",
    author = "Zhao, Yilun and Nan, Linyong and Qi, Zhenting and Zhang, Rui and Radev, Dragomir",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.emnlp-main.615",
    pages = "9006--9018",
    abstract = "Reasoning over tabular data requires both table structure understanding and a broad set of table reasoning skills.
Current models with table-specific architectures and pre-training methods perform well on understanding table structures, but they still struggle with tasks that require various table reasoning skills. In this work, we develop ReasTAP to show that high-level table reasoning skills can be injected into models during pre-training without a complex table-specific architecture design. We define 7 table reasoning skills, such as numerical operation, temporal comparison, and conjunction. Each reasoning skill is associated with one example generator, which synthesizes questions over semi-structured tables according to the sampled templates. We model the table pre-training task as a sequence generation task and pre-train ReasTAP to generate precise answers of the synthetic examples. ReasTAP is evaluated on four benchmarks covering three downstream tasks including 1) WikiSQL-Weak and WikiTQ for Table Question Answering, 2) TabFact for Table Fact Verification, and 3) LogicNLG for Faithful Table-to-Text Generation. Experimental results demonstrate that ReasTAP achieves new state-of-the-art results on all of them and delivers a significant improvement under low-resource setting. Our code is publicly available at https://github.com/Yale-LILY/ReasTAP.",
}
```
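
## Input format sketch

The Usage example hands the tokenizer a table and a query, and the abstract describes the task as sequence generation, so the table has to be flattened into text at some point. The sketch below illustrates one plausible flattening using only the standard library. The helper name `linearize_table` is hypothetical, and the `col : ... row 1 : ...` layout is an assumption modeled on TAPEX-style linearization; it is not taken from the ReasTAP tokenizer, which may differ in details such as casing, truncation, and special tokens.

```python
from typing import Dict, List


def linearize_table(columns: Dict[str, List[object]], query: str) -> str:
    """Flatten a column-oriented table and a query into one string.

    Hypothetical helper for illustration only. The layout used here
    ("col : ... row 1 : ...") is an assumed, TAPEX-style format and
    not the ReasTAP tokenizer's actual implementation.
    """
    headers = list(columns)
    n_rows = len(columns[headers[0]]) if headers else 0

    # Every column must contribute exactly one cell per row.
    if any(len(columns[h]) != n_rows for h in headers):
        raise ValueError("all columns must have the same length")

    # Header segment first, then one segment per row (1-indexed).
    parts = ["col : " + " | ".join(headers)]
    for i in range(n_rows):
        cells = [str(columns[h][i]) for h in headers]
        parts.append(f"row {i + 1} : " + " | ".join(cells))

    # Assumption: the query is prepended and everything is lower-cased.
    return (query + " " + " ".join(parts)).lower()


data = {
    "year": [1896, 1900, 1904, 2004, 2008, 2012],
    "city": ["athens", "paris", "st. louis", "athens", "beijing", "london"],
}
flat = linearize_table(data, "Olympic Games")
print(flat)
```

Printing the flattened string for your own tables can be a quick way to sanity-check column names and cell values before running generation. The real tokenizer should always be used to produce model inputs.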