| TITLE = '<h1 align="center" id="space-title">WebWalkerQA Leaderboard</h1>' | |
| INTRO_TEXT = f""" | |
| ## About | |
| This leaderboard showcases the performance of models on the **WebWalkerQA benchmark**. WebWalkerQA is a collection of question-answering datasets designed to test models' ability to answer questions about web pages. | |
| """ | |
| HOW_TO = f""" | |
| ## Data | |
| The WebWalkerQA dataset is available on 🤗 [Hugging Face](https://huggingface.co/datasets/callanwu/WebWalkerQA). It comprises **680 question-answer pairs**, each linked to a corresponding web page. The benchmark is divided into two key components: | |
| - **Agent 🤖** | |
| - **RAG-system** | |
| ## How to Submit Your Method | |
| ### Submission Steps: | |
| To list your method's performance on this leaderboard, email **jialongwu@alibaba-inc.com** or **jialongwu@seu.edu.cn** with the following: | |
| 1. A JSONL file in the format: | |
| ```jsonl | |
| {{"question": "question_text", "prediction": "predicted_answer_text"}} | |
| ``` | |
| 2. Include the following details in your email: | |
| - **User Name** | |
| - **Type** (RAG-system or Agent) | |
| - **Method Name** | |
| Your method will be evaluated and added to the leaderboard. For reference, check out the [evaluation code](https://github.com/Alibaba-NLP/WebWalker/src/evaluate.py). | |
| """ | |
| CREDIT = f""" | |
| ## Credit | |
| This website is built using the following resources: | |
| - **Evaluation Code**: LangChain's cot_qa evaluator | |
| - **Leaderboard Code**: HuggingFaceH4's open_llm_leaderboard | |
| """ | |
| CITATION = f""" | |
| ## Citation | |
| If you find this work helpful, please cite it as: | |
| ```bibtex | |
| @article{{wu2025webwalker, | |
| title={{WebWalker: Benchmarking LLMs in Web Traversal}}, | |
| author={{Wu, Jialong and Yin, Wenbiao and Jiang, Yong and Wang, Zhenglin and Xi, Zekun and Fang, Runnan and Zhang, Linhai and He, Yulan and Zhou, Deyu and Xie, Pengjun and others}}, | |
| journal={{arXiv preprint arXiv:2501.07572}}, | |
| year={{2025}} | |
| }} | |
| ``` | |
| """ | |
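
The submission format described in `HOW_TO` is one JSON object per line with the keys `question` and `prediction`. The doubled braces in the source are f-string escaping, so the rendered example uses single braces. A minimal sketch of producing and sanity-checking such a file might look like the following; the helper names `write_predictions` and `read_predictions` and the file name `predictions.jsonl` are illustrative assumptions, not part of the WebWalker codebase, and the key check is a local convenience rather than the leaderboard's own validation.

```python
import json
from pathlib import Path


def write_predictions(records, path):
    """Write (question, prediction) pairs as one JSON object per line.

    Hypothetical helper: the leaderboard only specifies the line format,
    not how the file should be produced.
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for question, prediction in records:
            row = {"question": question, "prediction": prediction}
            # ensure_ascii=False keeps non-ASCII answers readable in the file.
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
    return path


def read_predictions(path):
    """Read the file back and check each line has exactly the two expected keys."""
    rows = []
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines, e.g. a trailing newline
            row = json.loads(line)
            if set(row) != {"question", "prediction"}:
                raise ValueError(f"line {line_no}: unexpected keys {sorted(row)}")
            rows.append(row)
    return rows


if __name__ == "__main__":
    # Placeholder strings taken from the format example, not real benchmark data.
    out = write_predictions(
        [("question_text", "predicted_answer_text")],
        "predictions.jsonl",
    )
    print(read_predictions(out))
```

Reading the file back before emailing it is a cheap way to catch a malformed line or a misspelled key early.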