Toloker Graph: Interaction of Crowd Annotators
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7620795
Description:
The graph contains 11,758 nodes and 519,000 edges representing interactions between crowd annotators on a single labeling project on the Toloka crowdsourcing platform (see the Toloka overview for details on the terminology used).
Each node represents an individual annotator and carries four numerical and three categorical features. An edge connects two annotators if they annotated the same task. Each node also has a label indicating whether the annotator was banned on this project.
Nodes are stored in nodes.tsv, a TSV file with the following columns:
id: unique identifier of the annotator
approved_rate: percentage of the approved labels of this annotator
skipped_rate: percentage of the skipped tasks of this annotator
expired_rate: percentage of the expired tasks of this annotator
rejected_rate: percentage of the rejected labels of this annotator
education: level of education as self-reported by this annotator (none, basic, middle, high)
english_profile: knowledge of English as self-reported by this annotator (0 for no, 1 for yes)
english_tested: whether the annotator passed the Toloka language test for English (0 for no, 1 for yes)
banned: whether the annotator was banned on this project (0 for no, 1 for yes)
The four *_rate attributes should sum to 1 for each annotator, so they are expressed as fractions of 1 rather than as values out of 100.
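The column layout above can be sketched as a small loader with a sanity check on the rate columns. This is a minimal sketch, not an official reader: it assumes nodes.tsv has a header row using exactly the column names listed above, it keeps id as a string because the description does not state its type, and the tolerance and the two sample rows are hypothetical values made up for illustration.

```python
import csv
import io
import math

# The four numerical features that are expected to sum to 1 per annotator.
RATE_COLUMNS = ("approved_rate", "skipped_rate", "expired_rate", "rejected_rate")
# Binary columns: two categorical features plus the banned label.
BINARY_COLUMNS = ("english_profile", "english_tested", "banned")


def read_nodes(handle):
    """Parse TSV rows into dicts with typed values.

    Assumption: the file has a header row with the column names
    listed in the dataset description.
    """
    nodes = []
    for row in csv.DictReader(handle, delimiter="\t"):
        node = {"id": row["id"], "education": row["education"]}
        for col in RATE_COLUMNS:
            node[col] = float(row[col])
        for col in BINARY_COLUMNS:
            node[col] = int(row[col])
        nodes.append(node)
    return nodes


def rates_sum_to_one(node, tol=1e-6):
    """Return True if the four rates sum to 1 within tol (tol is an assumed value)."""
    total = sum(node[col] for col in RATE_COLUMNS)
    return math.isclose(total, 1.0, abs_tol=tol)


# Hypothetical sample rows, not taken from the dataset.
SAMPLE_NODES = (
    "id\tapproved_rate\tskipped_rate\texpired_rate\trejected_rate\t"
    "education\tenglish_profile\tenglish_tested\tbanned\n"
    "0\t0.7\t0.1\t0.1\t0.1\thigh\t1\t0\t0\n"
    "1\t0.5\t0.2\t0.2\t0.2\tbasic\t0\t0\t1\n"
)

sample_nodes = read_nodes(io.StringIO(SAMPLE_NODES))
bad_ids = [n["id"] for n in sample_nodes if not rates_sum_to_one(n)]
```

To run it on the real file, pass an open file object instead of the in-memory sample, for example `read_nodes(open("nodes.tsv", newline="", encoding="utf-8"))`. If the published rates are rounded, a looser tolerance may be needed.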
Edges are stored in edges.tsv, a TSV file with the following columns:
source: source identifier of the annotator
target: target identifier of the annotator
As the graph is undirected, source and target can be interchanged for the given pair of nodes.
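Because source and target are interchangeable, one way to load the edge list is to store each pair in a canonical order so that (a, b) and (b, a) collapse to a single edge. The sketch below illustrates that idea under stated assumptions: edges.tsv is assumed to have a header row with the source and target columns, identifiers are compared as strings, and the sample edges are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_undirected_edges(handle):
    """Read edge rows and return a set of canonically ordered pairs.

    Assumption: the file has a header row with 'source' and 'target'.
    """
    edges = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        a, b = row["source"], row["target"]
        # Order each pair so that both directions map to the same tuple.
        edges.add((a, b) if a <= b else (b, a))
    return edges


def build_adjacency(edges):
    """Build a symmetric adjacency mapping from undirected edges."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


# Hypothetical sample: the pair (0, 1) appears in both directions.
SAMPLE_EDGES = "source\ttarget\n0\t1\n1\t0\n1\t2\n"

sample_edges = read_undirected_edges(io.StringIO(SAMPLE_EDGES))
sample_adjacency = build_adjacency(sample_edges)
```

In the sample, the two rows for annotators 0 and 1 produce one edge, and the adjacency mapping lists each neighbor from both endpoints. The description does not say whether the published file lists each pair once or twice; the set makes the loader indifferent to that.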
Created:
2023-03-27



