NetEaseCrowd

Source: arXiv. Updated 2024-03-11; indexed 2024-06-21.
Download link:
https://github.com/fuxiAIlab/NetEaseCrowd-Dataset
Overview:
NetEaseCrowd is a large-scale dataset that NetEase built on its mature data crowdsourcing platform to validate truth inference algorithms suited to online deployment. It contains approximately 2,413 workers, nearly 1 million tasks, and about 6 million annotations, collected over roughly 6 months. NetEaseCrowd retains the timestamp of every annotation and covers multiple task types, categorized by the worker abilities they call for. The construction process was designed to ensure data quality and diversity. The dataset mainly targets online truth inference problems such as task assignment and active learning, where analyzing how workers' abilities change over time can improve the efficiency and accuracy of inference.
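Because each annotation carries a timestamp, an online truth inference method can update its estimate as annotations arrive instead of waiting for the full batch. The sketch below illustrates that idea with the simplest possible baseline, a streaming majority vote. It is not the method used by the dataset's authors, and the tuple layout `(timestamp, task_id, worker_id, label)` and the toy data are hypothetical; they do not reflect the dataset's actual file schema.

```python
from collections import Counter, defaultdict


def online_majority_vote(annotations):
    """Yield (timestamp, task_id, estimate) after each annotation arrives.

    `annotations` is an iterable of (timestamp, task_id, worker_id, label)
    tuples. This layout is an assumption made for illustration only.
    """
    votes = defaultdict(Counter)  # task_id -> label counts seen so far
    # Sorting the tuples replays the annotations in timestamp order.
    for timestamp, task_id, worker_id, label in sorted(annotations):
        votes[task_id][label] += 1
        # Current estimate: the label with the most votes so far.
        estimate = votes[task_id].most_common(1)[0][0]
        yield timestamp, task_id, estimate


def final_estimates(annotations):
    """Return the last estimate produced for each task."""
    result = {}
    for _, task_id, estimate in online_majority_vote(annotations):
        result[task_id] = estimate
    return result


# Hypothetical toy data, deliberately given out of timestamp order.
toy = [
    (3, "t1", "w3", "A"),
    (1, "t1", "w1", "A"),
    (2, "t1", "w2", "B"),
    (4, "t2", "w1", "B"),
]

if __name__ == "__main__":
    for step in online_majority_vote(toy):
        print(step)
    print(final_estimates(toy))
```

A majority vote weights every worker equally. A method that models how worker ability drifts over time would replace the plain count with a time-dependent weight per worker, which is the kind of algorithm the dataset is intended to evaluate.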
Provided by:
NetEase
Created:
2024-03-11