
Towards a Semantic Representation for Functional Software Requirements (MARP-5 Dataset + Req2Vec Code)

Indexed by NIAID Data Ecosystem on 2026-03-14
Download link:
https://zenodo.org/records/6349908
Description:
Please cite this dataset as: Sonbol, R., Rebdawi, G. and Ghneim, N., 2020, September. Towards a Semantic Representation for Functional Software Requirements. In 2020 IEEE Seventh International Workshop on Artificial Intelligence for Requirements Engineering (AIRE) (pp. 1-8). IEEE. https://ieeexplore.ieee.org/abstract/document/9233034/

This dataset (MARP-5) consists of 5,852 pairs of requirements, constructed from a publicly available set of user stories created by Duke University. We annotated MARP-5 on a 5-point Likert scale: Extremely related, Very related, Somewhat related, Not very related, Not at all related. The dataset was independently annotated by two annotators with graduate-level education. The inter-annotator agreement (Cohen's kappa) between the two is 0.73, with a percentage agreement of 88.7%, which represents a substantial level of agreement. Finally, a third annotator (the first author of the paper) resolved conflicts to produce the final dataset.

The paper associated with the dataset, "Towards a Semantic Representation for Functional Software Requirements", can be found here: https://ieeexplore.ieee.org/abstract/document/9233034/

In this paper, we propose a semantic representation, called ReqVec, for functional software requirements. ReqVec is calculated in three main phases. First, a set of lexical and syntactic steps is performed to analyze textual requirements. Then, semantic dimensions for requirements are calculated using a word classifier and the well-known word embedding model Word2vec. Finally, ReqVec is constructed from the representations of these dimensions. Two experiments were conducted to evaluate how well ReqVec captures meaningful semantic information for two well-known Requirements Engineering tasks: detecting semantic relations between requirements, and requirements categorization. The proposed representation detected related requirements with an F-measure of 0.92 (using the MARP-5 dataset) and categorized requirements with an F-measure of 0.88.
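
The two agreement figures quoted in the description (percentage agreement and Cohen's kappa) can be illustrated with a small sketch. This is not the authors' code. The function names `percent_agreement` and `cohens_kappa`, the coding of the Likert scale as integers 1 to 5, and the ten toy labels are all assumptions made for illustration; they are not taken from MARP-5.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of items on which the two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa: observed agreement corrected for chance."""
    n = len(labels_a)
    p_observed = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both annotators pick the same label
    # if each labels independently according to their own label frequencies.
    categories = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy annotations (assumed coding: 5 = Extremely related,
# 1 = Not at all related). These values are made up for the example.
annotator_1 = [5, 4, 4, 3, 1, 1, 2, 5, 3, 1]
annotator_2 = [5, 4, 3, 3, 1, 1, 2, 5, 3, 2]

if __name__ == "__main__":
    print("percentage agreement:", percent_agreement(annotator_1, annotator_2))
    print("Cohen's kappa:", cohens_kappa(annotator_1, annotator_2))
```

Kappa is lower than raw agreement because it discounts the matches two annotators would produce by chance given how often each uses every label. The sketch uses the unweighted form; the description does not say whether a weighted variant was used for the ordinal scale.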
Created:
2022-09-26