Data files used to study the distribution of growth in software systems
Source: Research Data Australia (indexed 2024-12-14)
Download link:
https://researchdata.edu.au/files-used-study-software-systems/14865
Description:
The evolution of a software system can be studied in terms of how various properties, as reflected by software metrics, change over time. Current models of software evolution allow inferences to be drawn about certain attributes of a software system, for instance its architecture, its complexity, and the impact of these on development effort. However, an inherent limitation of these models is that they do not provide any direct insight into where growth takes place. In particular, they cannot be used to assess the impact of evolution on the underlying distribution of size and complexity among the various classes. Such an analysis is needed to answer questions such as 'do developers tend to evenly distribute complexity as systems get bigger?' and 'do large and complex classes get bigger over time?'. These questions are of more than passing interest: by understanding what typical and successful software evolution looks like, we can identify anomalous situations and take action earlier than might otherwise be possible. Information gained from an analysis of the distribution of growth will also show whether there are consistent boundaries within which a software design structure exists. In our study of metric distributions, we focused on 10 different measures that span a range of size and complexity measures. The raw metric data is provided in comma-separated values (CSV) format, and the first line of the CSV data contains the header. The download is a .zip file (~0.5 MB in total) containing 4 .txt files and 1 .log file. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
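As a rough illustration of how CSV data with a header on the first line might be read and summarised, the Python sketch below loads one numeric column and reports a few simple distribution figures. It is only a sketch under stated assumptions: the column names `class_name` and `method_count`, the sample rows, and the "share held by the largest 10% of classes" summary are all hypothetical and are not taken from the dataset or from the authors' Stata analysis.

```python
import csv
import io
import statistics


def load_metric(csv_file, column):
    """Read one numeric column from CSV text whose first line is the header."""
    reader = csv.DictReader(csv_file)  # the first line supplies the field names
    # Skip rows where the column is missing or empty.
    return [float(row[column]) for row in reader if row.get(column)]


def summarize(values):
    """Return simple distribution summaries for a list of metric values."""
    ordered = sorted(values)
    total = sum(ordered)
    # Share of the total held by the largest 10% of classes (at least one class).
    top_n = max(1, len(ordered) // 10)
    top_share = sum(ordered[-top_n:]) / total if total else 0.0
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "top_10pct_share": top_share,
    }


# Hypothetical sample data; the real column names and values may differ.
SAMPLE = """class_name,method_count
A,1
B,1
C,2
D,2
E,3
F,3
G,4
H,4
I,5
J,75
"""

if __name__ == "__main__":
    values = load_metric(io.StringIO(SAMPLE), "method_count")
    print(summarize(values))
```

To apply the same functions to an extracted file, pass an open file object instead of the `io.StringIO` sample, for example one opened with `open(path, newline="")`, and substitute the column name found in that file's header line.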
Provider:
Swinburne University of Technology



