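import gradio as gr
import pandas as pd

# Path to the leaderboard results, relative to the working directory.
csv_file_path = "formatted_data.csv"

# Load the results once at startup; the table in the UI displays them as-is.
df = pd.read_csv(csv_file_path)
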
markdown_text = """
## Benchmark Overview

- The benchmark evaluates the performance of Olas Predict tools on the Autocast dataset.
- The dataset has been refined to make it better suited to evaluating the tools.
- The leaderboard shows the performance of the tools on the refined dataset.
- The script to run the benchmark is available in the repo [here](https://github.com/valory-xyz/olas-predict-benchmark).

## How to run your tools on the benchmark

- Fork the repo [here](https://github.com/valory-xyz/olas-predict-benchmark).
- Initialize the submodules, then update them to get the latest dataset and `mech` tool:
    - `git submodule init`
    - `git submodule update --remote --recursive`
- Include your tool in the `mech/packages` directory.
    - Guidelines on how to include your tool can be found [here](xxx).
- Run the benchmark script.

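
The submodule steps above can also be scripted. The sketch below is a minimal, hypothetical helper (the names `SUBMODULE_COMMANDS` and `update_submodules` are not part of the repo); it assumes `git` is on the `PATH` and that `repo_dir` points at the root of your fork.

```python
import subprocess

# The two commands from the list above, as argument lists.
SUBMODULE_COMMANDS = [
    ["git", "submodule", "init"],
    ["git", "submodule", "update", "--remote", "--recursive"],
]


def update_submodules(repo_dir=".", runner=subprocess.run):
    # Hypothetical helper: run each command inside repo_dir and stop on the
    # first failure (check=True makes subprocess.run raise CalledProcessError).
    for command in SUBMODULE_COMMANDS:
        runner(command, cwd=repo_dir, check=True)
```

Calling `update_submodules()` from the root of the fork should be equivalent to typing the two commands by hand. The `runner` parameter exists only so the helper can be exercised without touching a real repository.
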

## Dataset Overview

This project uses the Autocast dataset from the research paper ["Forecasting Future World Events with Neural Networks"](https://arxiv.org/abs/2206.15474).
The dataset has been refined further to improve the performance evaluation of Olas mech prediction tools.
Both the original and refined datasets are hosted on HuggingFace.

### Refined Dataset Files

- You can find the refined dataset on HuggingFace [here](https://huggingface.co/datasets/valory/autocast).
- `autocast_questions_filtered.json`: A JSON subset of the original Autocast dataset.
- `autocast_questions_filtered.pkl`: A pickle file mapping URLs to their respective scraped documents within the filtered dataset.
- `retrieved_docs.pkl`: A pickle file containing all the scraped texts.

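
A minimal sketch of loading these files once they have been downloaded locally. The function name `load_refined_dataset` and the `data_dir` argument are illustrative assumptions, and the exact structure of each object is not specified here, so inspect the loaded objects before relying on particular keys. Only unpickle files you trust, since `pickle` can execute arbitrary code on load.

```python
import json
import pickle
from pathlib import Path


def load_refined_dataset(data_dir):
    # Hypothetical helper: read the three refined-dataset files from data_dir.
    data_dir = Path(data_dir)
    with open(data_dir / "autocast_questions_filtered.json", encoding="utf-8") as f:
        questions = json.load(f)
    with open(data_dir / "autocast_questions_filtered.pkl", "rb") as f:
        # Described above as a mapping from URLs to scraped documents.
        url_to_doc = pickle.load(f)
    with open(data_dir / "retrieved_docs.pkl", "rb") as f:
        retrieved_docs = pickle.load(f)
    return questions, url_to_doc, retrieved_docs
```
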

### Filtering Criteria

To refine the dataset, we applied the following criteria to ensure the reliability of the URLs:

- URLs that do not return an HTTP 200 status code are excluded.
- Difficult-to-scrape sites, such as Twitter and Bloomberg, are omitted.
- Links whose content has fewer than 1000 words are removed.
- Only samples with a minimum of 5 and a maximum of 20 working URLs are retained.

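
The criteria above can be sketched as a pair of pure functions. This is an illustration under stated assumptions, not the script used to build the dataset: the input shape (dicts with hypothetical `url`, `status`, and `text` keys), the whitespace-based word count, the domain list, and the names `is_working_url` and `keep_sample` are all assumptions.

```python
from urllib.parse import urlparse

# Assumed examples of hard-to-scrape hosts; the full list is not given above.
BLOCKED_DOMAINS = ("twitter.com", "bloomberg.com")
MIN_WORDS = 1000
MIN_URLS, MAX_URLS = 5, 20


def is_working_url(page):
    # page is a hypothetical dict: {"url": str, "status": int, "text": str}.
    if page["status"] != 200:
        return False
    host = urlparse(page["url"]).netloc.lower()
    if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
        return False
    # Word count approximated by splitting on whitespace.
    return len(page["text"].split()) >= MIN_WORDS


def keep_sample(pages):
    # Retain a sample only if it has between 5 and 20 working URLs, inclusive.
    working = [p for p in pages if is_working_url(p)]
    return MIN_URLS <= len(working) <= MAX_URLS
```

Keeping the per-URL check separate from the per-sample check makes each rule easy to test on its own.
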

### Scraping Approach

The content of the filtered URLs has been scraped using various libraries, depending on the source:

- `pypdf2` for PDF URLs.
- `wikipediaapi` for Wikipedia pages.
- `requests`, `readability-lxml`, and `html2text` for most other sources.
- `requests`, `beautifulsoup`, and `html2text` for BBC links.

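
One way to picture the per-source choice is a small dispatcher that maps a URL to the library group listed above. The detection heuristics (file extension for PDFs, host name for Wikipedia and BBC) and the names `classify_source` and `libraries_for` are assumptions made for this sketch; the actual scraping code may decide differently.

```python
from urllib.parse import urlparse

# Library groups taken from the list above, keyed by an assumed source label.
LIBRARIES = {
    "pdf": ("pypdf2",),
    "wikipedia": ("wikipediaapi",),
    "bbc": ("requests", "beautifulsoup", "html2text"),
    "other": ("requests", "readability-lxml", "html2text"),
}


def _host_matches(host, domain):
    # True for the domain itself and for any of its subdomains.
    return host == domain or host.endswith("." + domain)


def classify_source(url):
    # Assumed heuristics: file extension for PDFs, host name for the rest.
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if parsed.path.lower().endswith(".pdf"):
        return "pdf"
    if _host_matches(host, "wikipedia.org"):
        return "wikipedia"
    if _host_matches(host, "bbc.com") or _host_matches(host, "bbc.co.uk"):
        return "bbc"
    return "other"


def libraries_for(url):
    # Look up which libraries the list above associates with this URL.
    return LIBRARIES[classify_source(url)]
```
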
"""

with gr.Blocks() as demo:
    gr.Markdown("# Olas Predict Benchmark")
    gr.Markdown("Leaderboard showing the performance of Olas Predict tools on the Autocast dataset and an overview of the project.")
    gr.DataFrame(df)
    gr.Markdown(markdown_text)

demo.launch()