Data and results for the paper: "From Bugs to Benefits: Improving User Stories by Leveraging Crowd Knowledge with CrUISE-AC"

Indexed by NIAID Data Ecosystem on 2026-05-02

Download link:

https://zenodo.org/record/12778788

Description:

We provide the following files used in the study "From Bugs to Benefits: Improving User Stories by Leveraging Crowd Knowledge with CrUISE-AC". The paper has been accepted for presentation in the research track of the IEEE/ACM International Conference on Software Engineering (ICSE) 2025 and will be included in the conference proceedings. The preprint is available on arXiv.

User stories e-commerce.xlsx

307 real-world user stories from 3 different eCommerce projects. Project A defines a complete set of requirements for a B2C-focused online shop of a publishing house that aims to sell its own publications directly. Project B contains a partial set of requirements for a B2C-focused online shop of a bookseller. Project C includes a subset of B2C and B2B requirements for an online bookstore, supplemented by an eProcurement module designed to provide information and automation for industrial customers. Most of the user stories come with additional acceptance criteria, written in unstructured natural language. The user stories have been anonymized, and the merchants' real names were replaced with neutral terms.

Columns
- ID: a unique ID we assigned across all projects
- Project: the project (A, B or C) the user story belongs to
- Connextra: user story in the Connextra pattern
- Acceptance Criteria: acceptance criteria that came with the user story

User stories CMS.xlsx

34 CMS-related user stories from a dataset that was originally created by *Lucassen, G., Dalpiaz, F., van der Werf, J.M.E., Brinkkemper, S.: Visualizing user story requirements at multiple granularity levels via semantic relatedness. In: Conceptual Modeling: 35th International Conference, ER 2016, Gifu, Japan, November 14-17, 2016, Proceedings 35. pp. 463–478. Springer (2016)*

Columns
- ID: a unique ID we assigned
- Connextra: user story in the Connextra pattern

Issues e-commerce.xlsx

54,396 issues that we harvested from seven different issue trackers between June 2011 and July 2024:
- magento2 (https://github.com/magento/magento2/issues)
- nopCommerce (https://github.com/nopSolutions/nopCommerce/issues)
- OpenCart (https://github.com/opencart/opencart/issues)
- PrestaShop (https://github.com/PrestaShop/PrestaShop/issues)
- Shopware5 (https://issues.shopware.com/?products=SW-5)
- Shopware6 (https://issues.shopware.com/?products=SW-6)
- WooCommerce (https://github.com/woocommerce/woocommerce/issues)

Columns
- id: a unique ID we assigned
- Issue Tracker: the issue tracker this issue originates from
- Title: title of the original issue
- Body: body / description of the original issue
- Preprocessed: result of preprocessing the issue as described in the paper
- Sample: whether the issue was part of the 3,500 sample issues we used to evaluate CrUISE-AC

Issues CMS.xlsx

64,500 issues that we harvested from two different issue trackers between April 2002 and August 2024:
- Moodle (https://github.com/magento/magento2/issues)
- Umbraco (https://github.com/nopSolutions/nopCommerce/issues)

Columns are the same as for "Issues e-commerce.xlsx".

trivia-trainingdata.csv

Manually labelled dataset to train the trivia classifier. The dataset contains 1916 phrases, evenly split into 958 trivia and 958 non-trivia phrases.
- Label = 1: this sentence is trivia
- Label = 0: this sentence is not considered trivia

Any source code was replaced by [CODE] to simplify the classification process. Source code in Markdown can be identified easily because it is enclosed by a special character (see https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-and-highlighting-code-blocks).

Prompts
- prompt_match.txt: prompt we used across all LLMs to assess whether an issue might affect a given user story
- prompt_generate.txt: GPT4-turbo prompt to convert an issue text into Gherkin-style acceptance criteria for a given user story
- prompt_evaluate.txt: GPT4-turbo prompt to assess the usefulness of a newly generated acceptance criterion for a given user story

Evaluation e-commerce.xlsx

Issue / user story pairs, generated acceptance criteria and the results of the manual evaluation.

Columns
- StoryID: unique ID of the user story (refer to User stories e-commerce.xlsx)
- IssueID: unique ID of the issue (refer to Issues e-commerce.xlsx)
- Issue: preprocessed issue text used as the basis to generate the acceptance criterion
- Connextra: user story in the Connextra pattern
- Existing AC: acceptance criteria that originally came with the user story
- AC: acceptance criterion generated by CrUISE-AC
- AC_Explanation: explanation generated by CrUISE-AC of why this AC adds new knowledge to the current user story
- E1: evaluation result by expert 1 (1 = AC adds relevant knowledge; 0 = AC is irrelevant)
- E2: evaluation result by expert 2 (1 = AC adds relevant knowledge; 0 = AC is irrelevant)
- E3: evaluation result by expert 3 (1 = AC adds relevant knowledge; 0 = AC is irrelevant)
- E4: evaluation result by expert 4 (1 = AC adds relevant knowledge; 0 = AC is irrelevant)
- 3/4 majority: whether at least 3 experts assessed this AC as relevant (1 = yes; 0 = no)

Evaluation CMS.xlsx

Columns
- StoryID: unique ID of the user story (refer to User stories CMS.xlsx)
- IssueID: unique ID of the issue (refer to Issues CMS.xlsx)
- Issue: preprocessed issue text used as the basis to generate the acceptance criterion
- Connextra: user story in the Connextra pattern
- AC: acceptance criterion generated by CrUISE-AC
- AC_Explanation: explanation generated by CrUISE-AC of why this AC adds new knowledge to the current user story
- E1: evaluation result by expert 1 (1 = AC adds relevant knowledge; 0 = AC is irrelevant)
- E4: evaluation result by expert 4 (1 = AC adds relevant knowledge; 0 = AC is irrelevant)
- E5: evaluation result by expert 5 (1 = AC adds relevant knowledge; 0 = AC is irrelevant)
- 2/3 majority: whether at least 2 experts assessed this AC as relevant (1 = yes; 0 = no)
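The majority columns in the two evaluation files can be recomputed from the expert columns. The following is a minimal sketch of that rule, not the authors' script: the function name `majority_flag` and the two example rows are hypothetical, and only the thresholds (3 of 4 for e-commerce, 2 of 3 for CMS) come from the column descriptions above.

```python
from typing import Sequence


def majority_flag(votes: Sequence[int], threshold: int) -> int:
    """Return 1 if at least `threshold` experts voted 1 (relevant), else 0."""
    if not all(v in (0, 1) for v in votes):
        raise ValueError("expert votes must be 0 or 1")
    return 1 if sum(votes) >= threshold else 0


# Hypothetical rows, keyed by the expert columns of each evaluation file.
ecommerce_row = {"E1": 1, "E2": 1, "E3": 0, "E4": 1}
cms_row = {"E1": 1, "E4": 0, "E5": 0}

# "3/4 majority" for Evaluation e-commerce.xlsx, "2/3 majority" for Evaluation CMS.xlsx.
ecommerce_majority = majority_flag(list(ecommerce_row.values()), threshold=3)
cms_majority = majority_flag(list(cms_row.values()), threshold=2)

print(ecommerce_majority, cms_majority)  # prints: 1 0
```

Keeping the threshold as a parameter lets the same helper serve both files, which have different numbers of experts.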
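The [CODE] substitution described for trivia-trainingdata.csv could look roughly like the sketch below. This is an assumption about one possible approach, not the preprocessing used in the paper: it treats backtick (or tilde) fences as the enclosing "special character", additionally masks inline backtick spans, and the names `mask_code`, `FENCED` and `INLINE` are hypothetical.

```python
import re

# Fenced block: an opening fence of 3+ backticks or tildes, any content,
# then the same fence again (assumed delimiter, per GitHub Markdown).
FENCED = re.compile(r"(`{3,}|~{3,}).*?\1", re.DOTALL)
# Inline code: a single-line span between single backticks (an added assumption).
INLINE = re.compile(r"`[^`\n]+`")


def mask_code(markdown: str, placeholder: str = "[CODE]") -> str:
    """Replace fenced and inline Markdown code with a placeholder token."""
    masked = FENCED.sub(placeholder, markdown)  # fenced blocks first
    return INLINE.sub(placeholder, masked)      # then remaining inline spans


# Hypothetical issue text containing a fenced block and an inline span.
sample = "Error when running:\n```php\necho 1;\n```\nCall `foo()` again."
print(mask_code(sample))
```

Masking fenced blocks before inline spans avoids the inline pattern matching parts of a fence.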

Created:

2025-01-31