WikiReading Recycled

arXiv, updated 2020-11-06, indexed 2024-06-21
Download link:
https://github.com/applicaai/multi-property-extraction
Resource overview:
WikiReading Recycled is a public dataset developed by Applica.ai for multi-property extraction. It is built on the WikiReading dataset and removes that dataset's known flaws. It contains approximately 4.1 million samples and includes a human-annotated test set intended to support detailed analysis of model performance. Its application areas are information extraction and machine reading comprehension within natural language processing. It is designed to address the trade-off between data quality and quantity in existing datasets, as well as the problem of noisy data.
Provider:
Applica.ai
Created:
2020-11-06