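Crowdsourced Testing Service Quality Evaluation Dataset (众测服务质量评估数据集)

Name: Crowdsourced Testing Service Quality Evaluation Dataset
Creator: Army Engineering University of PLA (中国人民解放军陆军工程大学)
License: No description available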

Indexed by the National Basic Science Data Center (国家基础学科公共科学数据中心) on 2024-03-05

Download link:

https://www.nbsdc.cn/general/dataDetail?id=64ef8542bb16e0591d025784&type=1

Resource description:

This dataset supports research on quality evaluation of crowdsourced testing services. It was built to meet requirements of the project "Research and Development and Application of Integrated Crowdsourcing Testing Service Platform for Information Products and Scientific and Technological Services" (《信息产品及科技服务集成化众测服务平台研发与应用》), part of the National Key Research and Development Program. The data come from the runtime records of the integrated crowdsourced testing service platform: records were exported from the platform database and cleaned, and the quality indicators were then computed by an automated extraction program following the indicator definitions in the group standard "Information Technology - Crowdsourcing Testing - Quality Evaluation of Crowdsourcing Testing Services" (《信息技术众包测试众测服务质量评价》).

The dataset records five indicators. Report duplication evaluates duplicate test reports in the result integration stage, measured as the proportion of duplicate reports among all reports. Defect rarity is the proportion of workers who found a given defect among the workers who actually executed the task. Defect distribution breadth is the percentage of tasks in which defects were reported, relative to the total number of tasks. Defect distribution depth describes how bugs of the same severity level are distributed across crowdsourced testing tasks. Text correctness evaluates the text of task descriptions, checking words, punctuation, grammar, and semantics for errors and ambiguities, so that task quality is assessed from a text-analysis perspective.
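The ratio-based indicators described above can be illustrated with a short Python sketch. This is not the dataset's actual extraction program, and the group standard's exact formulas are not reproduced in this listing. The `Report` record, its field names, and all function names are hypothetical, and every result is returned as a fraction between 0 and 1 rather than as a percentage.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Report:
    """One test report row; every field name here is hypothetical."""
    task_id: str
    worker_id: str
    defect_id: Optional[str]   # None if the report describes no defect
    severity: Optional[str]    # e.g. "high" or "low"; None if no defect
    is_duplicate: bool         # flagged during result integration


def duplicate_ratio(reports: Iterable[Report]) -> float:
    """Share of reports flagged as duplicates among all reports."""
    rows = list(reports)
    if not rows:
        return 0.0
    return sum(r.is_duplicate for r in rows) / len(rows)


def defect_rarity(reports: Iterable[Report], task_id: str,
                  defect_id: str, workers_on_task: int) -> float:
    """Share of the task's active workers who found the given defect."""
    if workers_on_task <= 0:
        return 0.0
    finders = {r.worker_id for r in reports
               if r.task_id == task_id and r.defect_id == defect_id}
    return len(finders) / workers_on_task


def defect_breadth(reports: Iterable[Report],
                   all_task_ids: Iterable[str]) -> float:
    """Share of tasks with at least one reported defect."""
    tasks = set(all_task_ids)
    if not tasks:
        return 0.0
    with_defects = {r.task_id for r in reports if r.defect_id is not None}
    return len(with_defects & tasks) / len(tasks)


def defect_depth(reports: Iterable[Report], severity: str) -> Counter:
    """Distinct defects of one severity level, counted per task."""
    distinct = {(r.task_id, r.defect_id) for r in reports
                if r.defect_id is not None and r.severity == severity}
    return Counter(task_id for task_id, _ in distinct)
```

Counting distinct defects per task in `defect_depth` is one possible reading of "distribution of same-severity bugs across tasks"; the standard may define depth differently. Text correctness is omitted because it depends on language-checking tooling that the description does not specify.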

Provider:

Army Engineering University of PLA (中国人民解放军陆军工程大学)

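Collected Summary

Dataset Introduction

Background and Challenges

Background Overview

This research dataset targets quality evaluation of crowdsourced testing services. It is built from the runtime data of an integrated crowdsourced testing service platform, with an automated program computing multiple quality indicators, including report duplication and defect rarity. The dataset is 18.67 MB in size.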

The content above was collected and summarized by 遇见数据集.