Engineered dataset for Cross-Project Requirement Traceability in Natural Language Artefacts

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/rmdxf6g7pg
Description:
This dataset was compiled for cross-project requirement traceability using contrastive learning techniques on natural language artefacts. It contains 15,872 requirements across 37 projects and 7,624 validated cross-project links, organised into multiple Excel sheets that provide different views of the data.

The data were replicated from three kinds of sources: (1) open source repositories (25 projects); (2) an industrial dataset of 12 proprietary projects from 3 industry partners, with 20-35 requirements per project; and (3) benchmark datasets: (a) PURE, 79 smaller research projects with 5-15 requirements each, and (b) PROMISE NFR, 15 projects focused on non-functional requirements with 40 requirements each, covering the categories performance, security, usability, reliability, scalability, and maintainability.

The dataset supports the evaluation of traceability links across different projects. It is intended to contribute to both software engineering and natural language processing by enabling a more robust approach to cross-project traceability that can support knowledge transfer and reuse across software projects.

Features of the dataset: multiple data sheets in the Excel file for detailed analysis of requirements; cross-project relationships with confidence scoring and validation status; temporal data with creation dates and project timelines; multi-dimensional classification (functional/non-functional, priority, complexity); and a stakeholder attribution and tagging system.
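As an illustration of how the cross-project links with confidence scoring and validation status might be used, the following minimal Python sketch filters links in memory. The description does not publish the Excel schema, so the field names (`source_project`, `confidence`, `validated`, and so on), the sample records, and the 0.5 threshold are all hypothetical and would need to be mapped onto the real sheet columns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceLink:
    # Hypothetical fields; the real sheet columns may be named differently.
    source_project: str
    source_req_id: str
    target_project: str
    target_req_id: str
    confidence: float  # assumed to lie in [0, 1]
    validated: bool


def validated_cross_project_links(links, min_confidence=0.5):
    """Keep links that join two different projects, are marked as
    validated, and meet the (assumed) confidence threshold."""
    return [
        link
        for link in links
        if link.source_project != link.target_project
        and link.validated
        and link.confidence >= min_confidence
    ]


# Made-up sample records, for illustration only.
sample_links = [
    TraceLink("ProjA", "REQ-1", "ProjB", "REQ-7", 0.91, True),
    TraceLink("ProjA", "REQ-2", "ProjA", "REQ-3", 0.88, True),   # same project
    TraceLink("ProjB", "REQ-4", "ProjC", "REQ-9", 0.42, True),   # below threshold
    TraceLink("ProjC", "REQ-5", "ProjA", "REQ-1", 0.77, False),  # not validated
]

kept = validated_cross_project_links(sample_links)
```

With the sample records above, only the first link passes all three checks. Rows read from the actual Excel sheets could be converted into `TraceLink` objects once the real column names are known.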
Created:
2025-10-02