jhao committed on Feb 21

Commit 2314beb • 1 Parent(s): 9c2a184

Update reference and arxiv paper link

Files changed (1)

src/display/about.py

@@ -4,6 +4,9 @@ TITLE = """
 <h1 id="space-title">GTBench: Uncovering the Strategic Reasoning Limitation of LLMs via<br> Game-Theoretic Evaluations</h1>"""
 INTRODUCTION_TEXT = """
+paper: https://arxiv.org/abs/2402.12348
 GTBench aims to evaluate and rank LLMs’ reasoning abilities in competitive environments through game-theoretic tasks, e.g., board and card games.
 It utilizes 10 widely recognized games supported by <a href="https://github.com/google-deepmind/open_spiel">OpenSpiel</a> and evaluate well-recognized LLM agents in a language-driven manner. The evaluation code and prompt templates can be found in <a href="https://github.com/jinhaoduan/GTBench" target="_blank" >GTBench</a>.
@@ -58,4 +61,10 @@ EVALUATION_QUEUE_TEXT = """
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
 CITATION_BUTTON_TEXT = r"""
+@article{duan2024gtbench,
+    title = {GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations},
+    author = {Duan, Jinhao and Zhang, Renming and Diffenderfer, James and Kailkhura, Bhavya and Sun, Lichao and Stengel-Eskin, Elias and Bansal, Mohit and Chen, Tianlong and Xu, Kaidi},
+    year = {2024},
+    journal={arXiv preprint 2402.12348}
+}
 """
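Because the citation snippet is meant to be copied verbatim by users, a quick sanity check on the new `CITATION_BUTTON_TEXT` value can catch an unbalanced brace or a missing field before it ships. The sketch below is not part of the commit: the helper names `braces_balanced` and `bibtex_fields` are hypothetical, and the field parser assumes simple `name = {value}` pairs with no nested braces, which holds for this entry but not for BibTeX in general.

```python
import re

# Copy of the entry added in this commit (raw string, as in about.py).
CITATION_BUTTON_TEXT = r"""
@article{duan2024gtbench,
    title = {GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations},
    author = {Duan, Jinhao and Zhang, Renming and Diffenderfer, James and Kailkhura, Bhavya and Sun, Lichao and Stengel-Eskin, Elias and Bansal, Mohit and Chen, Tianlong and Xu, Kaidi},
    year = {2024},
    journal={arXiv preprint 2402.12348}
}
"""


def braces_balanced(text):
    """Return True if every '{' has a matching '}' and none closes early."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def bibtex_fields(text):
    """Extract simple `name = {value}` fields (hypothetical helper).

    Assumes values contain no nested braces; the entry key line
    (`@article{...,`) has no '=' and is therefore skipped.
    """
    pairs = re.findall(r"(\w+)\s*=\s*\{([^{}]*)\}", text)
    return {name.lower(): value for name, value in pairs}


fields = bibtex_fields(CITATION_BUTTON_TEXT)
```

A check like this could run in a unit test or a pre-commit hook; it does not replace a real BibTeX parser, which would be the safer choice if entries ever use nested braces or quoted values.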