Software Projects User Stories Estimates with PRED & MMRE
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/ytmwsf2997
Description:
This dataset contains empirical software effort estimation data collected from ten completed industrial web and mobile application projects developed by professional software teams, comprising over 100 user stories. The data was gathered as part of an empirical study investigating the accuracy of human-based and generative AI-based software effort estimation techniques. The projects span multiple application domains, including finance, healthcare, on-demand services, AI-enabled systems, and design platforms, and represent real-world agile development environments. All projects were completed prior to data collection, allowing the inclusion of actual recorded effort for each user story.
For each user story, the dataset includes:
Project and Story Identifiers
Unique identifiers for projects and their corresponding user stories, enabling traceability and project-level aggregation.
Actual Effort
The real development effort spent on each user story, recorded from project logs and expressed in person-hours (PH).
Single-Point Expert Judgment Estimate (EJ(SP))
An effort estimate provided by an experienced practitioner who was directly involved in the corresponding project, serving as an industrial baseline.
Planning Poker Estimate (PP)
Consensus-based effort estimates produced by trained, balanced teams of experienced software professionals using the Planning Poker technique and a modified Fibonacci scale.
Generative AI–Based Estimates
Independent single-point effort estimates generated for each user story using four widely used Generative AI tools: ChatGPT, DeepSeek, Microsoft Copilot, and Google Gemini.
All AI tools were prompted using an identical, structured prompt protocol and zero-shot learning setup to ensure fairness and reproducibility.
The dataset is suitable for:
Comparative evaluation of human vs. AI-based effort estimation.
Analysis of estimation accuracy using metrics such as MMRE and PRED(25).
Research on agile estimation techniques, large language models, and requirements-based effort prediction.
Replication studies and benchmarking of emerging AI estimation approaches.
By providing real industrial data with multiple estimation perspectives and ground-truth effort values, this dataset supports reproducible and transparent research in software effort estimation and AI-assisted project planning.
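The accuracy metrics named above can be sketched as follows. This is a minimal illustration using the standard definitions: MRE = |actual - estimate| / actual for each user story, MMRE as the mean of those values, and PRED(25) as the fraction of stories whose MRE is at most 0.25. The function names and the sample effort values are hypothetical and are not taken from the dataset; the dataset's actual column names would need to be checked against the downloaded files.

```python
from typing import Sequence


def mre(actual: float, estimate: float) -> float:
    """Magnitude of relative error for a single user story."""
    if actual <= 0:
        raise ValueError("actual effort must be positive")
    return abs(actual - estimate) / actual


def mmre(actuals: Sequence[float], estimates: Sequence[float]) -> float:
    """Mean magnitude of relative error across all user stories."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def pred(actuals: Sequence[float], estimates: Sequence[float],
         level: float = 0.25) -> float:
    """Fraction of user stories whose MRE is at most `level`."""
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for a, e in zip(actuals, estimates) if mre(a, e) <= level)
    return hits / len(actuals)


# Hypothetical person-hour values, for illustration only.
actual_ph = [10.0, 20.0, 8.0, 40.0]
estimate_ph = [12.0, 15.0, 8.0, 60.0]

print(f"MMRE     = {mmre(actual_ph, estimate_ph):.4f}")
print(f"PRED(25) = {pred(actual_ph, estimate_ph):.2f}")
```

Lower MMRE and higher PRED(25) indicate more accurate estimates. Applying the same two functions to each estimate source (EJ(SP), PP, and each AI tool) against the actual effort would give one comparable pair of scores per technique.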
Created:
2026-01-21



