SearchQA

Source: arXiv. Updated: 2017-06-11. Indexed: 2024-06-21.
Download link:
https://github.com/nyu-dl/SearchQA
Description:
SearchQA is a large-scale question answering (QA) dataset created by the Center for Data Science at New York University. It is designed to simulate real-world question answering by pairing each question with context retrieved from a search engine. The dataset contains over 140,000 QA pairs, each accompanied by an average of 49.6 text snippets from Google Search. It was built in two steps: QA pairs were first collected from J! Archive, and relevant text snippets for each pair were then retrieved through Google Search, which keeps the context realistic and challenging. SearchQA is used primarily in research on machine comprehension and automated question answering, with the goal of improving how well machines handle complex QA tasks.
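The structure described above (one question-answer pair plus a variable number of search snippets) can be sketched as a small in-memory record. This is only an illustration under stated assumptions: the class `SearchQAExample`, its field names, and both helper functions are hypothetical and do not reflect the file format used in the nyu-dl/SearchQA repository. The toy records are placeholders, not entries from the dataset.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class SearchQAExample:
    """Hypothetical record: one QA pair plus its retrieved snippets.

    Field names are illustrative assumptions, not the repository's schema.
    """
    question: str
    answer: str
    snippets: List[str] = field(default_factory=list)


def average_snippets_per_pair(examples: List[SearchQAExample]) -> float:
    """Mean number of snippets per QA pair (raises on an empty list)."""
    return mean(len(ex.snippets) for ex in examples)


def answer_in_snippets(example: SearchQAExample) -> bool:
    """True if the answer string occurs, case-insensitively, in any snippet."""
    target = example.answer.lower()
    return any(target in snippet.lower() for snippet in example.snippets)


if __name__ == "__main__":
    # Placeholder records for illustration only; not real dataset entries.
    toy = [
        SearchQAExample(
            question="toy question 1",
            answer="alpha",
            snippets=["text mentioning Alpha", "unrelated text", "more text"],
        ),
        SearchQAExample(
            question="toy question 2",
            answer="beta",
            snippets=["text without the answer"],
        ),
    ]
    print(average_snippets_per_pair(toy))
    print([answer_in_snippets(ex) for ex in toy])
```

A statistic such as the 49.6 snippets per pair quoted above would correspond to `average_snippets_per_pair` computed over the full dataset, once the real files are parsed into records like these.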
Provider:
Center for Data Science, New York University
Created:
2017-04-18