	# TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data

	Paper: https://arxiv.org/abs/2401.13223

	Code: https://github.com/fengbinzhu/TAT-LLM


	## Introduction

	We present TAT-LLM, a specialized language model for question answering (QA) over hybrid tabular and textual data. It follows a Step-wise Pipeline that decomposes the task into three steps: Extraction, Reasoning, and Execution. We obtain TAT-LLM by fine-tuning LLaMA 2 on training data generated automatically from existing expert-annotated datasets. In our experiments, TAT-LLM outperforms the baselines we compared against, including previous best models and large language models such as GPT-4, on FinQA, TAT-QA, and TAT-DQA. These results suggest that smaller, task-specific language models can be competitive with much larger general-purpose models on highly specialized tasks.

	| Model | Size | FinQA | TAT-QA | TAT-DQA |
	| --- | --- | --- | --- | --- |
	| GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
	| GPT-4 | - | 63.91 | 71.92 | 64.46 |
	| TAT-LLM-7B | 7B | 65.13 | 76.49 | 71.38 |
	| TAT-LLM-13B | 13B | 71.93 | 77.51 | 72.22 |
	| TAT-LLM-70B | 70B | 76.81 | 81.42 | 76.55 |


	## Training

	We train TAT-LLM in three sizes (7B, 13B, and 70B) by fine-tuning LLaMA 2 with Low-Rank Adaptation (LoRA) on the combined training sets of FinQA, TAT-QA, and TAT-DQA. To improve accuracy, we add an External Executor that processes the model's intermediate outputs to derive the final answer. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
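To illustrate the idea behind an external executor, here is a minimal sketch that evaluates an arithmetic equation emitted by a model instead of trusting the model's own arithmetic. It is not the executor from the paper or the repository: the function name `execute_equation`, the assumed input format (a plain arithmetic string with optional thousands separators or `$` signs), and the supported operators are all assumptions made for this example.

```python
import ast
import operator

# Operators this sketch is willing to evaluate (an assumption, not the paper's list).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression node."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Anything else (names, calls, attribute access, ...) is rejected.
    raise ValueError("unsupported expression")


def execute_equation(equation: str) -> float:
    """Hypothetical executor: compute the value of a model-generated equation."""
    # Drop thousands separators and currency signs that often appear in financial text.
    cleaned = equation.replace(",", "").replace("$", "").strip()
    tree = ast.parse(cleaned, mode="eval")
    return float(_evaluate(tree.body))
```

For example, `execute_equation("(1,250 - 1,000) / 1,000")` returns `0.25`. Parsing with `ast` rather than calling `eval` keeps the sketch from running arbitrary code contained in model output.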

	## Inference & Evaluation

	Please refer to the [code repository](https://github.com/fengbinzhu/TAT-LLM) for inference and evaluation.
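As a rough starting point, the sketch below shows how a LoRA adapter is typically attached to its base model with the `transformers` and `peft` libraries. Several things here are assumptions rather than facts from this card: that the adapter is published on the Hub as `next-tat/tat-llm-7b-lora` in PEFT format, that `meta-llama/Llama-2-7b-hf` is the matching base checkpoint, and the helper names `load_tat_llm` and `generate_answer`. The prompt must follow the template used in the code repository, which is not reproduced here.

```python
# Both identifiers are assumptions; check the repository for the exact checkpoints.
BASE_MODEL_ID = "meta-llama/Llama-2-7b-hf"
ADAPTER_ID = "next-tat/tat-llm-7b-lora"


def load_tat_llm(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (hypothetical helper)."""
    # Imported lazily so the heavy dependencies are only needed when loading.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference only
    return tokenizer, model


def generate_answer(tokenizer, model, prompt: str, max_new_tokens: int = 256) -> str:
    """Generate text for a prompt built with the repository's template."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `tokenizer, model = load_tat_llm()` followed by `generate_answer(tokenizer, model, prompt)`. The generated text still contains the intermediate steps, so deriving the final answer and computing scores should be done with the repository's own scripts.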

	## Citation

	If you find this repository helpful, please consider citing our paper:

	```
	@misc{zhu2024tatllm,
	title={TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data},
	author={Fengbin Zhu and Ziyang Liu and Fuli Feng and Chao Wang and Moxin Li and Tat-Seng Chua},
	year={2024},
	eprint={2401.13223},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```