ProtoQA

arXiv; updated 2020-10-28; indexed 2024-06-21
Download link:
https://github.com/iesl/protoqa-data
Description:
ProtoQA is a question answering dataset for training and evaluating AI systems on commonsense reasoning about prototypical situations, the kind of questions people answer easily by drawing on shared experience. It contains 9,700 questions, each with 7 to 8 annotated answer categories. The training set is drawn from questions used on the international game show Family Feud; the evaluation set was built by collecting answers from 100 crowdworkers. ProtoQA considers not only whether answers are correct but also how diverse they are and how they are ranked, which makes it suited to multi-response settings such as dialogue systems.
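To make the emphasis on diversity and ranking concrete, here is a minimal sketch of how a ranked list of answers could be scored against weighted answer categories. It is not the official ProtoQA evaluator: the example question, the category counts, the exact-string matching, and the function name `score_ranked_answers` are all assumptions made for illustration.

```python
# Hypothetical example data (not taken from ProtoQA): answer categories for
# one question, with the number of respondents whose answer fell into each.
question = "Name something people do before going to bed."
categories = {
    "brush teeth": 40,
    "read": 25,
    "watch tv": 20,
    "shower": 10,
    "pray": 5,
}


def score_ranked_answers(ranked_answers, categories, max_answers=3):
    """Return the share of the best achievable count earned by the top answers.

    Only the first `max_answers` answers are considered. Each category is
    credited at most once, so repeating an answer earns nothing; this is
    what rewards diverse answers. Matching is exact after lowercasing,
    which is a simplifying assumption.
    """
    credited = set()
    earned = 0
    for answer in ranked_answers[:max_answers]:
        key = answer.strip().lower()
        if key in categories and key not in credited:
            credited.add(key)
            earned += categories[key]
    # Best possible total: the `max_answers` largest categories.
    best = sum(sorted(categories.values(), reverse=True)[:max_answers])
    return earned / best


diverse = score_ranked_answers(["Brush teeth", "read", "watch tv"], categories)
repetitive = score_ranked_answers(["read", "read", "shower"], categories)
```

In this sketch the diverse list covers the three largest categories and earns full credit, while the repetitive list is credited for "read" only once and scores lower.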
Provided by:
College of Information and Computer Sciences, University of Massachusetts Amherst
Created:
2020-05-02