valory/autocast

Name: valory/autocast
Creator: valory
Published: 2024-04-10 13:40:52
License: no description provided

Hugging Face | Updated 2024-04-10 | Indexed 2024-03-04

Download link:

https://hf-mirror.com/datasets/valory/autocast

Description:

# Autocast

This is the Autocast dataset from the paper "[Forecasting Future World Events with Neural Networks](http://arxiv.org/abs/2206.15474)" by [Andy Zou](https://andyzoujm.github.io/), [Tristan Xiao](https://www.linkedin.com/in/tristan-xiao/), [Ryan Jia](https://www.linkedin.com/in/ryanjia/), [Joe Kwon](joekwon.io), [Mantas Mazeika](https://www.linkedin.com/in/mmazeika/), [Richard Li](https://www.linkedin.com/in/lirichard23/), [Dawn Song](https://people.eecs.berkeley.edu/~dawnsong/), [Jacob Steinhardt](https://www.stat.berkeley.edu/~jsteinhardt/), [Owain Evans](https://owainevans.github.io/), and [Dan Hendrycks](https://danhendrycks.com/).

The original dataset files are:

- `autocast_questions.json`
- `autocast_competition_test_set.json`
- `negated_tf_questions.json`

We have also processed the dataset to filter out source links that match any of the following:

- URLs returning non-200 HTTP status codes
- URLs from sites that are difficult to scrape, such as Twitter and Bloomberg
- Links with fewer than 1000 words

Only samples with a minimum of 5 working URLs are retained. The maximum number of working source links is 20.

The refined dataset files are:

- `autocast_questions_filtered.json` - a JSON subset of the initial autocast dataset.
- `autocast_questions_filtered.pkl` - a pickle file mapping URLs to the scraped data.
- `retrieved_docs.pkl` - this contains all texts that were retrieved.

<img align="center" src="assets/splash.png" width="750">

# Forecasting Future World Events with Neural Networks

## Introduction

Forecasting future world events is a challenging but valuable task. Forecasts of climate, geopolitical conflict, pandemics and economic indicators help shape policy and decision making. In these domains, the judgment of expert humans contributes to the best forecasts. Given advances in language modeling, can these forecasts be automated? To this end, we introduce Autocast, a dataset containing thousands of forecasting questions and an accompanying news corpus. Questions are taken from forecasting tournaments, ensuring high quality, real-world importance, and diversity. The news corpus is organized by date, allowing us to precisely simulate the conditions under which humans made past forecasts (avoiding leakage from the future). We test language models on our forecasting task and find that performance is far below a human expert baseline. However, performance improves with increased model size and incorporation of relevant information from the news corpus. In sum, Autocast poses a novel challenge for large language models and improved performance could bring large practical benefits.

## Autocast Dataset

The original [Autocast dataset can be downloaded here](https://people.eecs.berkeley.edu/~hendrycks/autocast.tar.gz). For more details on how to use the Autocast dataset and news articles, please refer to the short demonstration in `usage.ipynb` at the [repository](https://github.com/andyzoujm/autocast) of the authors.

Each question has the following fields:

```json
{
  "id": "unique identifier (str)",
  "question": "question body (str)",
  "background": "question context/details (str)",
  "qtype": "question type (str)",
  "status": "question status (str)",
  "choices": "choices or possible ranges (List or Dict)",
  "answer": "question resolution (str or float)",
  "crowd": "human crowd forecasts over time (List)",
  "publish_time": "publish timestamp (str)",
  "close_time": "close timestamp (str)",
  "prediction_count": "number of crowd predictions (int)",
  "forecaster_count": "number of crowd forecasters (int)",
  "tags": "question category (List)",
  "source_links": "source links from comments (List)"
}
```

The original authors obtained permission from [Metaculus](https://www.metaculus.com/) to host the dataset on GitHub for research purposes only.

## IntervalQA Dataset

Motivated by the difficulty of forecasting numbers across orders of magnitude (e.g. global cases of COVID-19 in 2022), the original authors also curate IntervalQA, a dataset of numerical questions and metrics for calibration. [Download the IntervalQA dataset here](https://people.eecs.berkeley.edu/~hendrycks/intervalqa.tar.gz).

## Citation

If you find this useful in your research, please consider citing the original authors:

    @article{zouforecasting2022,
      title={Forecasting Future World Events with Neural Networks},
      author={Andy Zou and Tristan Xiao and Ryan Jia and Joe Kwon and Mantas Mazeika and Richard Li and Dawn Song and Jacob Steinhardt and Owain Evans and Dan Hendrycks},
      journal={NeurIPS},
      year={2022}
    }

Provider:

valory

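Summary of Original Information

Autocast Dataset

Dataset Overview

The Autocast dataset comes from the paper "Forecasting Future World Events with Neural Networks" by Andy Zou et al. It contains thousands of forecasting questions and an accompanying news corpus, and is intended to test how well language models forecast future world events.

Data Files

The original dataset files are: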

autocast_questions.json
autocast_competition_test_set.json
negated_tf_questions.json

The processed, refined dataset files are:

autocast_questions_filtered.json - a JSON subset of the initial autocast dataset.
autocast_questions_filtered.pkl - a pickle file mapping URLs to the scraped data.
retrieved_docs.pkl - contains all texts that were retrieved.
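
The files listed above can be read with the Python standard library. The following is a minimal sketch, not the authors' loader: the helper names `load_questions` and `load_pickle` are hypothetical, and the file paths in the usage comment assume the files were downloaded into the working directory. The card does not spell out the internal layout of the pickle files beyond "mapping URLs to the scraped data", so the sketch returns whatever object the file contains.

```python
import json
import pickle
from pathlib import Path
from typing import Any, Union

PathLike = Union[str, Path]


def load_questions(path: PathLike) -> Any:
    """Parse a questions JSON file, e.g. autocast_questions_filtered.json."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def load_pickle(path: PathLike) -> Any:
    """Unpickle a file, e.g. autocast_questions_filtered.pkl.

    Unpickling can execute arbitrary code, so only load files
    obtained from a source you trust.
    """
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Usage (assumes the files sit in the current directory):
#   questions = load_questions("autocast_questions_filtered.json")
#   url_to_data = load_pickle("autocast_questions_filtered.pkl")
#   docs = load_pickle("retrieved_docs.pkl")
```

Inspecting `type(...)` and a few entries of each loaded object is a reasonable first step before relying on any particular structure.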

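Dataset Processing

The dataset was filtered as follows:

URLs returning a non-200 HTTP status code are removed.
Links to sites that are difficult to scrape (such as Twitter and Bloomberg) are removed.
Links with fewer than 1000 words are removed.
Only samples with at least 5 working URLs are retained, with at most 20 working source links kept.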
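
The filtering rules above can be sketched as a small pure function. This is an illustration under stated assumptions, not the pipeline that produced the dataset: `FetchedPage`, `keep_link`, and `filter_source_links` are hypothetical names; the blocklist holds only the two sites the card names as examples; words are counted by whitespace splitting; and the 20-link maximum is applied by keeping the first 20 working links, which the card does not confirm.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Assumption: only the two example sites from the card; the real list is not given.
BLOCKED_DOMAINS = ("twitter.com", "bloomberg.com")
MIN_WORDS = 1000   # links with fewer words are dropped
MIN_LINKS = 5      # samples with fewer working links are dropped
MAX_LINKS = 20     # upper bound on working links per sample


@dataclass
class FetchedPage:
    """Result of fetching one URL (hypothetical container)."""
    status_code: int
    text: str


def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


def keep_link(url: str, page: Optional[FetchedPage]) -> bool:
    """Apply the three per-link rules: status 200, not blocked, enough words."""
    if page is None or page.status_code != 200:
        return False
    if is_blocked(url):
        return False
    # Assumption: "words" means whitespace-separated tokens.
    return len(page.text.split()) >= MIN_WORDS


def filter_source_links(
    source_links: List[str], pages: Dict[str, FetchedPage]
) -> Optional[List[str]]:
    """Return the working links for one sample, or None if the sample is dropped."""
    working = [u for u in source_links if keep_link(u, pages.get(u))]
    if len(working) < MIN_LINKS:
        return None
    # Assumption: the cap is enforced by truncation in original order.
    return working[:MAX_LINKS]
```

Separating fetching from filtering keeps the rules testable without network access: `pages` can be built once by any scraper and then passed in.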

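Dataset Fields

Each question has the following fields:

    {
      "id": "unique identifier (str)",
      "question": "question body (str)",
      "background": "question context/details (str)",
      "qtype": "question type (str)",
      "status": "question status (str)",
      "choices": "choices or possible ranges (List or Dict)",
      "answer": "question resolution (str or float)",
      "crowd": "human crowd forecasts over time (List)",
      "publish_time": "publish timestamp (str)",
      "close_time": "close timestamp (str)",
      "prediction_count": "number of crowd predictions (int)",
      "forecaster_count": "number of crowd forecasters (int)",
      "tags": "question category (List)",
      "source_links": "source links from comments (List)"
    }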
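
The field list above can double as a quick sanity check on loaded records. The sketch below is illustrative only: `missing_fields` and `count_by_qtype` are hypothetical helpers, and the code assumes each question is a plain `dict` keyed by the field names shown. The card does not list the possible `qtype` values, so none are hard-coded.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List

# Field names exactly as listed in the dataset card.
EXPECTED_FIELDS = (
    "id", "question", "background", "qtype", "status", "choices",
    "answer", "crowd", "publish_time", "close_time",
    "prediction_count", "forecaster_count", "tags", "source_links",
)


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the expected field names that are absent from one record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def count_by_qtype(records: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Tally records per question type; records without one go under '<missing>'."""
    return dict(Counter(r.get("qtype", "<missing>") for r in records))
```

Running `missing_fields` over every record before training or evaluation surfaces schema drift early, for example if a filtered file omits a field.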

Dataset Download

The original Autocast dataset can be downloaded from: https://people.eecs.berkeley.edu/~hendrycks/autocast.tar.gz

Citation

To cite this dataset, please use the following entry:

@article{zouforecasting2022,
  title={Forecasting Future World Events with Neural Networks},
  author={Andy Zou and Tristan Xiao and Ryan Jia and Joe Kwon and Mantas Mazeika and Richard Li and Dawn Song and Jacob Steinhardt and Owain Evans and Dan Hendrycks},
  journal={NeurIPS},
  year={2022}
}
