nielsr (HF Staff) committed · verified
Commit 4ad1c7d · 1 Parent(s): efa7197

Add pipeline_tag and library_name to metadata


Hi! I'm Niels from the community science team at Hugging Face.

This PR improves the model card by adding `pipeline_tag: text-generation` and `library_name: transformers` to the metadata. These additions help users discover the model through filtering on the Hub and enable the automated code snippet widget for usage with the Transformers library.

The existing documentation, including the usage examples and evaluation results, is already quite comprehensive, so I have preserved those sections.
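
With `pipeline_tag: text-generation` and `library_name: transformers` in place, the Hub's snippet widget can surface a Transformers usage example roughly like the sketch below. This is a minimal sketch; the repository id `tablegpt/TableGPT-R1` is an assumption for illustration, so substitute the model's actual Hub id.

```python
from transformers import pipeline

# Repo id is an assumption for illustration -- replace with the actual Hub id.
generator = pipeline(
    "text-generation",
    model="tablegpt/TableGPT-R1",
    torch_dtype="auto",   # load weights in their native precision
    device_map="auto",    # place layers across available GPU(s)/CPU
)

output = generator("List three common table reasoning tasks:", max_new_tokens=64)
print(output[0]["generated_text"])
```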

Files changed (1): README.md (+9 −7)
```diff
--- a/README.md
+++ b/README.md
@@ -1,10 +1,12 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen3-8B
 language:
 - zh
 - en
-base_model:
-- Qwen/Qwen3-8B
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 
 # TableGPT-R1
@@ -180,13 +182,13 @@ Inquiries and feedback are welcome at [j.zhao@zju.edu.cn](mailto:j.zhao@zju.edu.
 
 TableGPT-R1 demonstrates substantial advancements over its predecessor, TableGPT2-7B, particularly in table comprehension and reasoning capabilities. Detailed comparisons are as follows:
 
-* **TableBench Benchmark**: TableGPT-R1 demonstrates strong performance. It achieves an average gain of 6.9\% over the Qwen3-8B across four core sub-tasks. Compared to the TableGPT2-7B, it records an average improvement of 3.12\%, validating its enhanced reasoning capability despite a trade-off in the PoT task.
+* **TableBench Benchmark**: TableGPT-R1 demonstrates strong performance. It achieves an average gain of 6.9% over the Qwen3-8B across four core sub-tasks. Compared to the TableGPT2-7B, it records an average improvement of 3.12%, validating its enhanced reasoning capability despite a trade-off in the PoT task.
 
-* **Natural Language to SQL**: TableGPT-R1 exhibits superior generalization capabilities. While showing consistent improvements over Qwen3-8B on Spider 1.0 (+0.66\%) and BIRD (+1.5\%), it represents a significant leap compared to TableGPT2-7B, registering dramatic performance increases of 12.35\% and 13.89\%, respectively.
+* **Natural Language to SQL**: TableGPT-R1 exhibits superior generalization capabilities. While showing consistent improvements over Qwen3-8B on Spider 1.0 (+0.66%) and BIRD (+1.5%), it represents a significant leap compared to TableGPT2-7B, registering dramatic performance increases of 12.35% and 13.89%, respectively.
 
-* **RealHitBench Test**: In this highly challenging test, TableGPT-R1 achieved outstanding results, particularly surpassing the top closed-source baseline model GPT-4o. This highlights its powerful capabilities in hierarchical table reasoning. Quantitative analysis shows that TableGPT-R1 matches or outperforms Qwen3-8B across subtasks, achieving an average improvement of 11.81\%, with a remarkable peak gain of 31.17\% in the Chart Generation task. Furthermore, compared to TableGPT2-7B, the model represents a significant advancement, registering an average improvement of 19.85\% across all subtasks.
+* **RealHitBench Test**: In this highly challenging test, TableGPT-R1 achieved outstanding results, particularly surpassing the top closed-source baseline model GPT-4o. This highlights its powerful capabilities in hierarchical table reasoning. Quantitative analysis shows that TableGPT-R1 matches or outperforms Qwen3-8B across subtasks, achieving an average improvement of 11.81%, with a remarkable peak gain of 31.17% in the Chart Generation task. Furthermore, compared to TableGPT2-7B, the model represents a significant advancement, registering an average improvement of 19.85% across all subtasks.
 
-* **Internal Benchmark**: Evaluation further attests to the model's robustness. TableGPT-R1 surpasses Qwen3-8B by substantial margins: 10.8\% on the Table Info and 8.8\% on the Table Path.
+* **Internal Benchmark**: Evaluation further attests to the model's robustness. TableGPT-R1 surpasses Qwen3-8B by substantial margins: 10.8% on the Table Info and 8.8% on the Table Path.
 
 
 | Benchmark | Task | Met. | Q3-8B | T-LLM | Llama | T-R1-Z | TGPT2 | **TGPT-R1** | Q3-14B | Q3-32B | Q3-30B | QwQ | GPT-4o | DS-V3 | Q-Plus | vs.Q3-8B | vs.TGPT2 |
```
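
As a quick sanity check on the change, the added fields become machine-readable through `huggingface_hub` once merged. A minimal sketch, again assuming the hypothetical repository id `tablegpt/TableGPT-R1`:

```python
from huggingface_hub import ModelCard

# Hypothetical repo id for illustration; use the model's actual Hub id.
card = ModelCard.load("tablegpt/TableGPT-R1")

# These fields come from the YAML front matter added in this PR.
print(card.data.pipeline_tag)  # expected: "text-generation"
print(card.data.library_name)  # expected: "transformers"
```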