Spaces:

valory
/

olas-prediction-leaderboard

Running

olas-prediction-leaderboard / tabs /faq.py

cyberosa

Adding timeline info of the autocast dataset

2a62f64 9 months ago

4.56 kB

	about_olas_predict_benchmark = """\
	How good are LLMs at making predictions about events in the future? This is a topic that hasn't been well explored to date.
	[Olas Predict](https://olas.network/services/prediction-agents) aims to rectify this by incentivizing the creation of agents that predict the future (through prediction markets).
	This is a leaderboard showing the performance of LLM tools for making predictions (event forecasting) on a dataset, refined from Autocast.\
	The leaderboard shows tool performance in terms of accuracy and cost. \

	🗓 🧐 The autocast dataset resolved-questions are from a timeline ending in 2022. Thus the current reported accuracy measure might be an in-sample forecasting one. We are working
	to incorporate soon an out-of-sample one using another dataset with unseen data.\

	🤗 Pick a tool and run it on the benchmark using the "🔥 Run the Benchmark" page!
	"""

	about_the_tools = """\
	- [Prediction Offline](https://github.com/valory-xyz/mech/blob/main/packages/valory/customs/prediction_request/prediction_request.py) - Uses prompt engineering, but no web crawling, to make predictions
	- [Prediction Online](https://github.com/valory-xyz/mech/blob/main/packages/valory/customs/prediction_request/prediction_request.py) - Uses prompt engineering, as well as web crawling, to make predictions
	- [Prediction SME](https://github.com/valory-xyz/mech/blob/main/packages/nickcom007/customs/prediction_request_sme/prediction_request_sme.py) - Use prompt engineering to get the LLM to act as a Subject Matter Expert (SME) in making a prediction.
	- [Prediction with RAG](https://github.com/valory-xyz/mech/blob/main/packages/napthaai/customs/prediction_request_rag/prediction_request_rag.py) - Uses retrieval-augment-generation (RAG) over extracted search result to make predictions.
	- [Prediction with Research Report](https://github.com/valory-xyz/mech/blob/main/packages/polywrap/customs/prediction_with_research_report/prediction_with_research_report.py) - Generates a research report before making a prediction.
	- [Prediction with Reasoning](https://github.com/valory-xyz/mech/blob/main/packages/napthaai/customs/prediction_request_reasoning/prediction_request_reasoning.py) - Incorporates an additional call to the LLM to do reasoning over retrieved data.
	- [Prediction with CoT](https://github.com/valory-xyz/mech/blob/main/packages/napthaai/customs/prediction_url_cot/prediction_url_cot.py) - Use Chain of Thought (CoT) to make predictions.
	"""

	about_the_dataset = """\
	## Dataset Overview
	This project leverages the Autocast dataset from the research paper titled ["Forecasting Future World Events with Neural Networks"](https://arxiv.org/abs/2206.15474).
	The dataset has undergone further refinement to enhance the performance evaluation of Olas mech prediction tools.
	Both the original and refined datasets are hosted on HuggingFace.
	### Refined Dataset Files
	- You can find the refined dataset on HuggingFace [here](https://huggingface.co/datasets/valory/autocast).
	- `autocast_questions_filtered.json`: A JSON subset of the initial autocast dataset.
	- `autocast_questions_filtered.pkl`: A pickle file mapping URLs to their respective scraped documents within the filtered dataset.
	- `retrieved_docs.pkl`: Contains all the scraped texts.
	### Filtering Criteria
	To refine the dataset, we applied the following criteria to ensure the reliability of the URLs:
	- URLs not returning HTTP 200 status codes are excluded.
	- Difficult-to-scrape sites, such as Twitter and Bloomberg, are omitted.
	- Links with less than 1000 words are removed.
	- Only samples with a minimum of 5 and a maximum of 20 working URLs are retained.
	### Scraping Approach
	The content of the filtered URLs has been scraped using various libraries, depending on the source:
	- `pypdf2` for PDF URLs.
	- `wikipediaapi` for Wikipedia pages.
	- `requests`, `readability-lxml`, and `html2text` for most other sources.
	- `requests`, `beautifulsoup`, and `html2text` for BBC links.
	"""

	about_olas_predict = """\
	Olas is a network of autonomous services that can run complex logic in a decentralized manner, interacting with on- and off-chain data autonomously and continuously. For other use cases check out [olas.network](https://olas.network/).
	Since 'Olas' means 'waves' in Spanish, it is sometimes referred to as the 'ocean of services' 🌊.
	The project is co-created by [Valory](https://www.valory.xyz/). Valory aspires to enable communities, organizations and countries to co-own AI systems, beginning with decentralized autonomous agents.
	"""