
"Cybersecurity, Cloud Computing, and IT Support Q&A Dataset"

Source: DataCite Commons | Updated: 2025-08-22 | Indexed: 2026-05-03
Download link:
https://ieee-dataport.org/documents/cybersecurity-cloud-computing-and-it-support-qa-dataset
Description:
"This paper presents a comprehensive English cybersecurity question-answer dataset designed to serve as a knowledge base for Retrieval-Augmented Generation (RAG) systems and cybersecurity support applications. The dataset comprises 2,650 unique question-answer pairs covering three critical domains: cybersecurity (40%), cloud computing (35%), and IT support (25%). This resource addresses the need for high-quality, domain-specific knowledge bases that can power AI-driven cybersecurity assistance systems.

The dataset encompasses diverse cybersecurity scenarios, including incident response procedures, security threat analysis, network security protocols, digital forensics, authentication systems, cloud security frameworks, compliance standards (GDPR, HIPAA, SOX, PCI-DSS), and technical support workflows. Each question-answer pair covers varying complexity levels, from basic troubleshooting queries to sophisticated cybersecurity incident management, incorporating domain-specific terminology such as CVE references, MITRE ATT&CK framework classifications, OWASP categories, and ENISA guidelines.

The knowledge base was systematically constructed through expert curation and comprehensive preprocessing pipelines that preserve semantic integrity while ensuring compatibility with modern AI systems. Questions range from simple factual inquiries to complex multi-faceted problem-solving scenarios, while answers provide actionable guidance with appropriate technical depth. The dataset includes technical acronyms, multi-step procedural descriptions, vulnerability classifications, and compliance terminology that reflect real-world cybersecurity professional needs.

Each entry follows a structured format with clearly defined question and answer components, making it suitable for training and evaluating retrieval-augmented generation systems, question-answering models, and knowledge-based chatbots focused on cybersecurity domains. The preprocessing includes text cleaning, encoding standardization, and semantic integrity preservation to ensure high-quality inputs for downstream AI applications.

This dataset enables researchers and practitioners to develop and evaluate AI systems capable of providing accurate, context-aware cybersecurity support. The resource supports research in information retrieval, domain-specific natural language processing, and cybersecurity automation applications. By providing a comprehensive English knowledge base spanning multiple cybersecurity domains, this dataset facilitates the development of intelligent cybersecurity assistance systems that can help organizations manage security challenges more effectively.

The dataset is released to promote research in cybersecurity AI systems and to support the development of more accessible and effective cybersecurity knowledge management tools. This resource represents one of the most comprehensive cybersecurity QA datasets available, contributing valuable domain-specific content to both the cybersecurity and natural language processing research communities.

Keywords: Cybersecurity Dataset, Question-Answering, Knowledge Base, Retrieval-Augmented Generation, Natural Language Processing, IT Support, Cloud Computing Security, Incident Response, Technical Documentation, AI Training Data"
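The description mentions text cleaning and encoding standardization applied to unique question-answer pairs, but it does not publish the pipeline or the file layout. The following is only a minimal sketch of what such a step might look like. The field names `question` and `answer`, the helper names `clean_text` and `preprocess`, and the choice of NFKC normalization with case-insensitive deduplication are all assumptions made for illustration, not the authors' method.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Normalize Unicode and collapse whitespace in a single field."""
    # NFKC folds compatibility characters (e.g. non-breaking spaces) into
    # their plain equivalents. Assumed choice, not taken from the source.
    text = unicodedata.normalize("NFKC", text)
    # Drop control/format characters (Unicode category C*) unless they are whitespace.
    text = "".join(
        ch for ch in text
        if ch.isspace() or unicodedata.category(ch)[0] != "C"
    )
    # Collapse runs of whitespace (including newlines) into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def preprocess(pairs):
    """Clean each pair and keep only the first occurrence of each question."""
    seen = set()
    cleaned = []
    for pair in pairs:
        # "question" and "answer" are hypothetical field names.
        q = clean_text(pair["question"])
        a = clean_text(pair["answer"])
        key = q.casefold()
        if not q or not a or key in seen:
            continue
        seen.add(key)
        cleaned.append({"question": q, "answer": a})
    return cleaned


# Toy records invented for this example; they are not from the dataset.
raw = [
    {"question": "What is  MFA?\u00a0", "answer": "Multi-factor\nauthentication."},
    {"question": "what is mfa?", "answer": "Duplicate entry."},
]
result = preprocess(raw)
```

Deduplicating on the case-folded question is one simple way to enforce uniqueness; a real pipeline might instead compare question and answer together or use fuzzy matching.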
Provider:
IEEE DataPort
Created:
2025-08-22