about_olas_predict_benchmark = """\
How good are LLMs at making predictions about events in the future? This is a topic that hasn't been well explored to date.
[Olas Predict](https://olas.network/services/prediction-agents) aims to rectify this by incentivizing the creation of agents that make predictions about future events (through prediction markets).
These agents are tested in the wild on real-time prediction market data, which you can see [here](https://huggingface.co/datasets/valory/prediction_market_data) on HuggingFace (updated weekly).
However, if you want to create an agent with new tools, waiting for real-time results to arrive is slow. This is where the Olas Predict Benchmark comes in. It allows devs to backtest new approaches on a historical event forecasting dataset (refined from [Autocast](https://arxiv.org/abs/2206.15474)) with high iteration speed.
The resolved questions in the Autocast dataset come from a timeline ending in 2022, so the models may have been trained on some of this data. The currently reported accuracy may therefore be an in-sample measure.
Even so, we can learn about the relative strengths of the different approaches (e.g., models and logic) before testing the most promising ones on real-time unseen data.
This HF Space showcases the accuracy and cost of the various models and workflows (called tools in the Olas ecosystem) for making predictions.
Pick a tool and run it on the benchmark using the "Run the Benchmark" page! (This feature is temporarily disabled due to an error in HF Spaces.)
"""
about_the_tools = """\
- [Prediction Offline](https://github.com/valory-xyz/mech/blob/main/packages/valory/customs/prediction_request/prediction_request.py) - Uses prompt engineering, but no web crawling, to make predictions.
- [Prediction Online](https://github.com/valory-xyz/mech/blob/main/packages/valory/customs/prediction_request/prediction_request.py) - Uses prompt engineering, as well as web crawling, to make predictions.
- [Prediction with RAG](https://github.com/valory-xyz/mech/blob/main/packages/napthaai/customs/prediction_request_rag/prediction_request_rag.py) - Uses retrieval-augmented generation (RAG) over extracted search results to make predictions.
- [Prediction with Reasoning](https://github.com/valory-xyz/mech/blob/main/packages/napthaai/customs/prediction_request_reasoning/prediction_request_reasoning.py) - Adds an extra LLM call that reasons over the retrieved data before predicting.
"""
about_the_dataset = """\
## Dataset Overview
This project leverages the Autocast dataset from the research paper titled ["Forecasting Future World Events with Neural Networks"](https://arxiv.org/abs/2206.15474).
The dataset has undergone further refinement to enhance the performance evaluation of Olas mech prediction tools.
Both the original and refined datasets are hosted on HuggingFace.
### Refined Dataset Files
- You can find the refined dataset on HuggingFace [here](https://huggingface.co/datasets/valory/autocast).
- `autocast_questions_filtered.json`: A JSON subset of the original Autocast dataset.
- `autocast_questions_filtered.pkl`: A pickle file mapping URLs to their respective scraped documents within the filtered dataset.
- `retrieved_docs.pkl`: Contains all the scraped texts.
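
As a minimal sketch, the two `autocast_questions_filtered` files could be read as follows once they have been downloaded locally. The function names and the local paths are assumptions for illustration, and nothing here depends on the internal layout of the files beyond the descriptions above.

```python
import json
import pickle
from pathlib import Path


def load_questions(path):
    # Parse the filtered question subset from its JSON file.
    with Path(path).open(encoding='utf-8') as fh:
        return json.load(fh)


def load_url_documents(path):
    # Unpickle the mapping from URLs to scraped documents.
    # Only unpickle files that come from a source you trust.
    with Path(path).open('rb') as fh:
        return pickle.load(fh)
```

For example, `load_url_documents('autocast_questions_filtered.pkl')` would return the mapping, assuming the file sits in the current working directory.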
### Filtering Criteria
To refine the dataset, we applied the following criteria to ensure the reliability of the URLs:
- URLs not returning HTTP 200 status codes are excluded.
- Difficult-to-scrape sites, such as Twitter and Bloomberg, are omitted.
- Links whose pages contain fewer than 1,000 words are removed.
- Only samples with a minimum of 5 and a maximum of 20 working URLs are retained.
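
The criteria above can be sketched as a small filter. This is an illustration rather than the code used to build the dataset: the `UrlCheck` record, the function names, and the exact domain list are hypothetical, and the sketch assumes that samples with more than 20 working URLs are dropped rather than truncated.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Assumed domain list; the criteria name only Twitter and Bloomberg as examples.
HARD_TO_SCRAPE = {'twitter.com', 'bloomberg.com'}
MIN_WORDS = 1000
MIN_URLS, MAX_URLS = 5, 20


@dataclass
class UrlCheck:
    # Hypothetical record of what was observed for one source link.
    url: str
    status_code: int
    word_count: int


def is_working(check: UrlCheck) -> bool:
    # A link counts as working if it returned HTTP 200, is not on a
    # hard-to-scrape domain, and yielded at least MIN_WORDS words.
    host = (urlparse(check.url).hostname or '').lower()
    blocked = any(host == d or host.endswith('.' + d) for d in HARD_TO_SCRAPE)
    return check.status_code == 200 and not blocked and check.word_count >= MIN_WORDS


def keep_sample(checks: list) -> bool:
    # Retain a sample only if it has between MIN_URLS and MAX_URLS working links.
    working = [c for c in checks if is_working(c)]
    return MIN_URLS <= len(working) <= MAX_URLS
```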
### Scraping Approach
The content of the filtered URLs has been scraped using various libraries, depending on the source:
- `pypdf2` for PDF URLs.
- `wikipediaapi` for Wikipedia pages.
- `requests`, `readability-lxml`, and `html2text` for most other sources.
- `requests`, `beautifulsoup`, and `html2text` for BBC links.
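
The per-source choice of libraries can be sketched as a simple dispatch on the URL. The function name `pick_scraper` and the matching rules (a `.pdf` path suffix, the Wikipedia and BBC host names) are assumptions for illustration; the sketch only returns the library names listed above and does not perform any scraping.

```python
from urllib.parse import urlparse


def pick_scraper(url: str) -> tuple:
    # Return the names of the libraries listed above for this URL.
    parsed = urlparse(url)
    host = (parsed.hostname or '').lower()
    path = parsed.path.lower()
    if path.endswith('.pdf'):
        return ('pypdf2',)
    if host == 'wikipedia.org' or host.endswith('.wikipedia.org'):
        return ('wikipediaapi',)
    if host in ('bbc.com', 'bbc.co.uk') or host.endswith(('.bbc.com', '.bbc.co.uk')):
        return ('requests', 'beautifulsoup', 'html2text')
    # Default path for most other sources.
    return ('requests', 'readability-lxml', 'html2text')
```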
"""
about_olas_predict = """\
Olas is a network of autonomous services that can run complex logic in a decentralized manner, interacting with on- and off-chain data autonomously and continuously. For other use cases, check out [olas.network](https://olas.network/).
Since 'Olas' means 'waves' in Spanish, it is sometimes referred to as the 'ocean of services'.
The project is co-created by [Valory](https://www.valory.xyz/). Valory aspires to enable communities, organizations and countries to co-own AI systems, beginning with decentralized autonomous agents.
"""