
Machine Learning for Software Engineering: A Tertiary Study

NIAID Data Ecosystem, indexed 2026-03-14
Download link:
https://zenodo.org/record/5715474
Resource description:
Dataset of the research paper: Machine Learning for Software Engineering: A Tertiary Study

Machine learning (ML) techniques increase the effectiveness of software engineering (SE) lifecycle activities. We systematically collected, quality-assessed, summarized, and categorized 83 reviews in ML for SE published between 2009–2022, covering 6,117 primary studies. The SE areas most tackled with ML are software quality and testing, while human-centered areas appear more challenging for ML. We propose a number of ML for SE research challenges and actions including: conducting further empirical validation and industrial studies on ML; reconsidering deficient SE methods; documenting and automating data collection and pipeline processes; reexamining how industrial practitioners distribute their proprietary data; and implementing incremental ML approaches.

The following data and source files are included.

review-protocol.md: The protocol employed in this tertiary study
data/
  dl-search/
    input/
      acm_comput_surveys_overviews.bib: Surveys of ACM Computing Surveys journal
      acm_comput_surveys_overviews_titles.txt: Titles of surveys
      acm_comput_ml_surveys.bib: Machine learning (ML)-related surveys of ACM Computing Surveys journal
      acm_comput_ml_surveys_titles.txt: Titles of ML-related surveys
      dl_search_queries.txt: Search queries applied to IEEE Xplore, ACM Digital Library, and Elsevier Scopus
      ml_keywords.txt: ML-related keywords extracted from ML-related survey titles and used in the search queries
      se_keywords.txt: Software Engineering (SE)-related keywords derived from the 15 SWEBOK Knowledge Areas (KAs, excluding Computing Foundations, Mathematical Foundations, and Engineering Foundations) and used in the search queries
      secondary_studies_keywords.txt: Survey-related keywords composed of the 15 keywords introduced in the tertiary study on SLRs in SE by Kitchenham et al. (2010), and the survey titles, and used in the search queries
    output/
      acm/
        acm{1–9}.bib: Search results from ACM Digital Library
      ieee.csv: Search results from IEEE Xplore
      scopus_analyze_year.csv: Yearly distribution of ML and SE documents extracted from Scopus's Analyze search results page
      scopus.csv: Search results from Scopus
  study-selection/
    backward_snowballing.csv: Additional secondary studies found through the backward snowballing process
    backward_snowballing_references.csv: References of quality-accepted secondary studies
    cohen_kappa_agreement.csv: Inter-rater reliability of reviewers in study selection
    dl_search_results.csv: Aggregated search results of all three digital libraries
    forward_snowballing_reviewer_{1,2}.csv: Forward snowballing citations of quality-accepted studies, divided between reviewers 1 and 2 and assessed by each against the inclusion/exclusion criteria (IC/EC)
    study_selection_reviewer_{1,2}.csv: Search results, divided between reviewers 1 and 2 and assessed by each against the IC/EC
  quality-assessment/
    dare_assessment.csv: Quality assessment (QA) of selected secondary studies based on the Database of Abstracts of Reviews of Effects (DARE) criteria by the Centre for Reviews and Dissemination, University of York
    quality_accepted_studies.csv: Details of quality-accepted studies
    studies_for_review.bib: Bibliography details and QA scores of selected secondary studies
  data-extraction/
    further_research.csv: Recommendations for further research of quality-accepted studies
    further_research_general.csv: The complete list of associated studies for each general recommendation
    knowledge_areas.csv: Classification of quality-accepted studies using the SWEBOK KAs and subareas
    ml_techniques.csv: Classification of the quality-accepted studies based on a four-axis ML classification scheme, along with extracted ML techniques employed in the studies
    primary_studies.csv: Details of primary studies reviewed by the quality-accepted secondary studies
    research_methods.csv: Citations of the research methods employed by the quality-accepted studies
    research_types_methods.csv: Research types and methods employed by the quality-accepted studies
src/
  data-analysis.ipynb: Analysis of data extraction results (data preprocessing, top authors and institutions, study types, yearly distribution of publishers, QA scores, and SWEBOK KAs) and creation of all figures included in the study
  scopus-year-analysis.ipynb: Yearly distribution of ML and SE publications retrieved from Elsevier Scopus
  study-selection-preprocessing.ipynb: Processing of digital library search results to conduct the inter-rater reliability estimation and study selection process
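The inter-rater reliability recorded in cohen_kappa_agreement.csv is a Cohen's kappa statistic. As a rough illustration of what that measure computes, the following Python sketch derives kappa from two reviewers' include/exclude decisions. The decision lists are hypothetical and are not taken from the dataset, and the function is a generic textbook formulation, not necessarily the procedure used in study-selection-preprocessing.ipynb.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical decisions for eight candidate studies (illustration only).
reviewer_1 = ["include", "exclude", "include", "exclude",
              "include", "exclude", "exclude", "include"]
reviewer_2 = ["include", "exclude", "exclude", "exclude",
              "include", "exclude", "include", "include"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3))  # 0.5
```

In this toy example the reviewers agree on 6 of 8 studies (observed agreement 0.75), and chance agreement is 0.5, so kappa is 0.5.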
Created:
2022-09-16