Commit 89eacba by frankliu666 · 1 parent: c1ca442

Updates README.md

  1. README.md +9 -6
README.md CHANGED
@@ -13,20 +13,23 @@ Code: https://github.com/fengbinzhu/TAT-LLM
 
  ## Introduction
 
- We present TAT-LLM, a specialized language model crafted through the innovative Step-wise Pipeline approach, focusing on the nuanced realm of tabular and textual question answering (QA). This model is the fruit of rigorously fine-tuning the LLaMA 2 architecture with a novel dataset, autonomously generated from expertly annotated resources. TAT-LLM stands at the intersection of tabular comprehension and textual analysis, engineered to excel by embodying three fundamental phases: Extraction, Reasoning, and Execution. Our empirical findings illuminate TAT-LLM's remarkable capability to eclipse traditional benchmarks, surmounting even the most advanced models and colossal language models such as GPT-4 across a suite of demanding QA tasks like FinQA, TAT-QA, and TAT-DQA. This endeavor not only sets a new standard for task-specific language models but also paves the way for future explorations in optimizing smaller models for highly specialized functions.
+ We present TAT-LLM, a specialized language model crafted through the innovative Step-wise Pipeline approach, focusing on the nuanced realm of tabular and textual question answering (QA). This model is the fruit of rigorously fine-tuning the LLaMA 2 architecture with a novel dataset, autonomously generated from expertly annotated resources. TAT-LLM stands at the intersection of tabular comprehension and textual analysis, engineered to excel by embodying three fundamental phases: Extraction, Reasoning, and Execution. Our empirical findings illuminate TAT-LLM's remarkable capability to eclipse traditional benchmarks, surmounting even the most advanced models and colossal language models such as GPT-4 across a suite of demanding financial QA tasks like FinQA, TAT-QA, and TAT-DQA. This endeavor not only sets a new standard for task-specific language models but also paves the way for future explorations in optimizing smaller models for highly specialized functions.
 
  | Model | Size | FINQA | TATQA | TATDQA |
  | --- | --- | --- | --- | --- |
  | GPT-3.5-Turbo | - | 58.00 | 59.47 | 52.74 |
  | GPT-4 | - | 63.91 | 71.92 | 64.46 |
- | TAT-LLM-7B | 7B | 65.13 | 76.49 | 71.38 |
- | TAT-LLM-13B | 13B | 71.93 | 77.51 | 72.22 |
- | TAT-LLM-70B | 70B | **76.81** | **81.42** | **76.55** |
+ | TAT-LLM-7B-LORA | 7B | 65.13 | 76.49 | 71.38 |
+ | TAT-LLM-7B-FFT | 7B | 69.75 | 76.91 | 72.64 |
+ | TAT-LLM-13B-LORA | 13B | 71.93 | 77.51 | 72.22 |
+ | TAT-LLM-13B-FFT | 13B | 72.97 | 78.41 | 73.18 |
+ | TAT-LLM-70B-LORA | 70B | **76.81** | 81.42 | 76.55 |
+ | TAT-LLM-70B-FFT | 70B | 76.11 | **82.20** | **76.97** |
 
 
  ## Training
 
- We train our TAT-LLM model in various sizes, including 7B, 13B, and 70B, by fine-tuning LLaMA 2 using Low-Rank Adaptation (LoRa) on a combination of the train sets from FinQA, TAT-QA and TAT-DQA datasets. To refine accuracy, we introduce an External Executor, enhancing the model by processing intermediate outputs to derive conclusive answers. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
+ We train our TAT-LLM model in various sizes, including 7B, 13B, and 70B, using different methods such as parameter-efficient fine-tuning and full-parameter fine-tuning of LLaMA 2 on a combination of financial data from the FinQA, TAT-QA, and TAT-DQA datasets. To refine accuracy, we introduce an External Executor, enhancing the model by processing intermediate outputs to derive conclusive answers. Please refer to the [paper](https://arxiv.org/abs/2401.13223) for more details.
 
  ## Inference & Evaluation
 
@@ -34,7 +37,7 @@ Please refer to code [here](https://github.com/fengbinzhu/TAT-LLM)
 
  ## Citation
 
- If you find this repository helpful, please consider citing our paper:
+ If you find this model helpful, please consider citing our paper:
 
  ```
  @misc{zhu2024tatllm,