arxiv:2310.09263

Table-GPT: Table-tuned GPT for Diverse Table Tasks

Published on Oct 13, 2023
· Featured in Daily Papers on Oct 16, 2023

Abstract

Language models, such as GPT-3.5 and ChatGPT, demonstrate remarkable abilities to follow diverse human instructions and perform a wide range of tasks. However, when probing language models using a range of basic table-understanding tasks, we observe that today's language models are still sub-optimal in many table-related tasks, likely because they are pre-trained predominantly on one-dimensional natural-language texts, whereas relational tables are two-dimensional objects. In this work, we propose a new "table-tuning" paradigm, where we continue to train/fine-tune language models like GPT-3.5 and ChatGPT, using diverse table-tasks synthesized from real tables as training data, with the goal of enhancing language models' ability to understand tables and perform table tasks. We show that our resulting Table-GPT models demonstrate (1) better table-understanding capabilities, by consistently outperforming the vanilla GPT-3.5 and ChatGPT, on a wide-range of table tasks, including holdout unseen tasks, and (2) strong generalizability, in its ability to respond to diverse human instructions to perform new table-tasks, in a manner similar to GPT-3.5 and ChatGPT.

Community

Can you open-source your training datasets for the different table tasks?



Can you write a top 5 points summary?


Let's start discussing this paper.

Wtf is going on in these comments lol? Anyway, here's my summary...

Tables are everywhere - reports, databases, webpages. They neatly organize data for humans to parse. But despite strong language skills, AI still struggles with table comprehension.

Even models like GPT-3.5 fail at basic tasks like finding where a missing value should go. This is because they're trained mostly on free-flowing text, not 2D tabular data. Unlike unstructured text, data in tables derives meaning from its structure and position!

So researchers at Microsoft tried "table-tuning" - extending training with synthesized table task cases. Tasks like "impute missing value X" or "identify outliers in this table". They did this using a corpus of real-world tables.
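To make that synthesis step concrete, here's a minimal sketch of how one missing-value-imputation example could be built from a real table (my own illustration, not the authors' code; the function and field names are made up):

```python
import random

def make_imputation_example(header, table):
    """Synthesize one missing-value-imputation task from a real table.

    header: list of column names; table: list of rows (lists of cell strings).
    (Hypothetical helper, not from the paper.)
    """
    # Mask a random cell; its true value becomes the training label.
    r = random.randrange(len(table))
    c = random.randrange(len(header))
    label = table[r][c]
    masked = [row[:] for row in table]
    masked[r][c] = "[MISSING]"

    # Serialize the masked table as markdown for the prompt.
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in masked]
    prompt = ("Below is a table with one cell marked [MISSING].\n"
              + "\n".join(lines)
              + f"\nWhat value should replace [MISSING] in column '{header[c]}'?")
    return {"prompt": prompt, "completion": label}
```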

They also augmented the data further by paraphrasing instructions, reordering rows/columns, chaining model responses, and more, which helps protect against overfitting.
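And a similar sketch of one of those augmentations, column reordering, which yields a new training example without changing the underlying answer (again my own illustration, not code from the paper):

```python
import random

def permute_columns(header, table):
    """Column-reordering augmentation (hypothetical helper): shuffle the
    columns, keeping each row's cells aligned with their columns, so the
    model can't rely on a fixed column order."""
    order = list(range(len(header)))
    random.shuffle(order)
    new_header = [header[i] for i in order]
    new_table = [[row[i] for i in order] for row in table]
    return new_header, new_table
```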

The resulting Table-GPT models showed big improvements:

  • 25%+ better at unseen table tasks like missing-value identification
  • Beat GPT-3.5 on 98% of test cases over 9 different table tasks
  • Stayed strong even after targeted downstream tuning

Table-tuning seems a promising step toward AI that can handle tables. That would unlock automated analysis over the troves of valuable tabular data out there.

TLDR: Training models on a large and diverse dataset of synthesized table tasks significantly boosts their table skills.

Full Summary is here.

Can you open-source your training datasets for the different table tasks?

Not sure about that, but here's a table dataset with more than 800 billion tokens!

https://huggingface.co/datasets/approximatelabs/tablib-v1-full


I am still not convinced of GPT's ability to handle arithmetic calculations in any context, be it tables or time-series data. So far, the outputs I have seen are not even close.

Key conclusion from my review of v1 of this paper (published on 13th Oct 2023):

This paper offers a good introduction to simple table-tuning tasks; however, task T-3 (TQA) should be significantly improved before Table-GPT can be used commercially.

Key points:
• The overall results indicate that table-tuning works very well; however, in my opinion the tasks should be divided by complexity to better understand the value of table-tuning. Please see the diagrams below.


• For most of the easy tasks (all tasks except T-3), table-tuning offers large zero-shot improvements over the vanilla models (209% improvement for GPT-3.5 and 119% for ChatGPT).
• For most of the easy tasks (all tasks except T-3), table-tuning offers good few-shot results compared to the vanilla models (22% improvement for GPT-3.5 and 12% for ChatGPT).

(Diagram: table-tuning results for all tasks except T-3)

• T-3 (TQA) is the most complex task (and the one with the biggest business demand), and for this task table-tuning offers only small improvements (1-2% for ChatGPT and 5-8% for GPT-3.5), which is probably not worth the fine-tuning effort.

(Diagram: table-tuning results for task T-3)

Open questions:
• Do you have plans to fine-tune GPT-4?
• Can you share recommendations on improving T-3 (TQA)? Maybe by including TQA tasks in training?
• Can you include as well T-12 (NS) in tests?
• Can you specify the number of tokens used (both for training and for test execution) for each task?

Other remarks:
• Markdown format increases the performance of table-tuning by 3% compared to CSV and by 5% compared to JSON (Table 5); see the serialization sketch after this list.
• For most tasks, few-shot prompting offers a strong improvement over zero-shot for vanilla GPT-3.5 and ChatGPT (even without table-tuning).
• Typos found in paper:
- p.4: “toke-by-token” should be “token-by-token”
- p.6: “few select table-tasks” should be “few selected table-tasks”
- p.7: “describes the row-augmentation task” should be “describes the column-augmentation task”
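On the serialization remark above, here's a tiny sketch (my own, not from the paper) showing the same table rendered as markdown and as CSV before being placed into a prompt:

```python
import csv, io

header = ["name", "country"]
rows = [["Ada", "UK"], ["Linus", "Finland"]]

# Markdown serialization (the format the paper reports works best).
md = "\n".join(
    ["| " + " | ".join(header) + " |",
     "| " + " | ".join("---" for _ in header) + " |"]
    + ["| " + " | ".join(r) + " |" for r in rows])

# CSV serialization of the same table.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(header)
writer.writerows(rows)

print(md)
print(buf.getvalue())
```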


