Replication Data for: Communication networks do not predict success in attempts at peer production

Indexed by NIAID Data Ecosystem on 2026-03-14
Download link:
https://doi.org/10.7910/DVN/48QC7B
Description:
This replication dataset includes code and data to replicate the paper "Communication networks do not predict success in attempts at peer production". The data are of three types:

1. A compressed tar file (.tar.xz) of compressed XML files of edits made to wikis. This includes the full text of every revision made to the 1430 wikis that were part of our analysis as of early 2010 (different wikis were collected at different times).
   Note: Due to the Dataverse's file size limit, this file is split into two parts, wiki_com_networks-wiki_dump.tar.xz.partaa and wiki_com_networks-wiki_dump.tar.xz.partab. To combine them, run:
   cat wiki_com_networks-wiki_dump.tar.xz.part* > wiki_com_networks-wiki_dump.tar.xz
2. A compressed tar file of the wikiq TSV files with metadata about each edit, created using the wikiq parser (https://code.communitydata.science/mediawiki_dump_tools.git). Those wishing to convert the XML files into TSV files themselves can use the wikiq parser.
3. Summary CSV files with data about the communication network and activity levels for each wiki, that is, the data used for the analyses in the paper. Code for converting the TSV files into these summary CSV files is included.

A more detailed description of how to replicate the figures and analyses from the paper is given in the README file included with the code.
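For readers without a Unix shell, the cat step for the split archive can be approximated in Python. This is a minimal sketch, not part of the dataset's own code: the function names combine_parts and list_members are hypothetical, and it assumes only what the description states, namely that the parts share the archive name plus a .part* suffix and that concatenating them in sorted order yields a valid .tar.xz file.

```python
import glob
import shutil
import tarfile


def combine_parts(prefix: str, out_path: str) -> list[str]:
    """Concatenate every file named `<prefix>.part*` into `out_path`.

    Parts are joined in sorted name order (partaa, partab, ...), which is
    the same order the shell glob in the `cat` command would produce.
    Hypothetical helper, written for illustration only.
    """
    parts = sorted(glob.glob(glob.escape(prefix) + ".part*"))
    if not parts:
        raise FileNotFoundError(f"no files matching {prefix}.part*")
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts are never fully in memory.
                shutil.copyfileobj(src, out)
    return parts


def list_members(archive_path: str) -> list[str]:
    """Return the member names of an xz-compressed tar archive."""
    with tarfile.open(archive_path, "r:xz") as tar:
        return tar.getnames()


# Example usage, assuming both parts are in the current directory:
#   name = "wiki_com_networks-wiki_dump.tar.xz"
#   combine_parts(name, name)
#   print(list_members(name)[:10])
```

Listing the members before extracting is a cheap way to confirm that the parts were joined correctly, since a truncated or misordered archive will normally fail to open or to read through.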

Created:
2023-01-17