Whoisjutanlee/cvetop100en
Hugging Face | Updated 2026-04-21 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/Whoisjutanlee/cvetop100en
Resource description:
---
language:
- en
license: apache-2.0
pretty_name: CVE Top 100 (English)
size_categories:
- 1K<n<10K
task_categories:
- question-answering
- text-classification
tags:
- cybersecurity
- ayinedjimi-consultants
- en
- cve
- vulnerabilities
- exploit
authors:
- name: Ayi NEDJIMI
  url: https://ayinedjimi-consultants.fr/bio.html
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
source_datasets:
- original
dataset_info:
  dataset_size: null
  download_size: null
---
# CVE Top 100 Dataset - English Edition
A comprehensive bilingual dataset of the 100 most critical and exploited CVEs from 2014 to 2024, with educational Q&A on vulnerability management.
## Description
This dataset contains:
- **100 critical CVEs** documented with precise technical details
- **50 French question-answer pairs** on CVEs and vulnerability management
- **50 English question-answer pairs** for bilingual learning
- **Detailed statistics** on vulnerability trends
### Vulnerability Types Covered
- Remote Code Execution (RCE)
- Privilege Escalation
- Authentication Bypass
- Information Disclosure
- Denial of Service (DoS)
### Affected Systems
- Windows (XP, Vista, 7, 8, 10, Server 2003-2022)
- Linux (Kernel, distributions)
- Web Applications (Drupal, Joomla, WordPress, Struts2)
- Networking Equipment (Cisco, F5 BIG-IP, Citrix)
- Cloud Infrastructure (VMware, Kubernetes, Exchange, SharePoint)
## Dataset Contents
### 1. **cves.json** (100 CVEs)
Each CVE includes:
```json
{
  "id": "CVE-2021-44228",
  "name": "Log4Shell",
  "description_fr": "Vulnérabilité critique dans Apache Log4j 2...",
  "description_en": "Critical vulnerability in Apache Log4j 2...",
  "cvss_score": 10.0,
  "cvss_severity": "critical",
  "affected_products": ["Apache Log4j 2.x"],
  "attack_vector": "network",
  "cwe_id": "CWE-917",
  "exploit_available": true,
  "actively_exploited": true,
  "patch_available": true,
  "discovery_year": 2021,
  "mitigation_fr": "...",
  "mitigation_en": "...",
  "references_url": ["https://nvd.nist.gov/vuln/detail/CVE-2021-44228"]
}
```
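A minimal sketch of how records in this shape could be filtered, assuming `cves.json` is a top-level JSON array of such objects, as the Data Formats section describes. The helper names `load_cves` and `exploited_above` are illustrative and not part of the dataset. The inline sample carries only the fields the filter needs, and its `actively_exploited` value for Heartbleed is a placeholder.

```python
import json
from pathlib import Path


def load_cves(path):
    """Read a JSON file that holds a top-level array of CVE records."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def exploited_above(cves, min_score):
    """Return actively exploited CVEs with cvss_score >= min_score, highest first."""
    hits = [
        c for c in cves
        if c.get("actively_exploited") and c.get("cvss_score", 0.0) >= min_score
    ]
    return sorted(hits, key=lambda c: c["cvss_score"], reverse=True)


# Illustrative sample; scores come from this card, the Heartbleed flag is assumed.
sample = [
    {"id": "CVE-2021-44228", "name": "Log4Shell", "cvss_score": 10.0, "actively_exploited": True},
    {"id": "CVE-2021-34527", "name": "PrintNightmare", "cvss_score": 8.8, "actively_exploited": True},
    {"id": "CVE-2014-0160", "name": "Heartbleed", "cvss_score": 7.5, "actively_exploited": False},
]

for cve in exploited_above(sample, 9.0):
    print(cve["id"], cve["name"], cve["cvss_score"])
```

With a local copy of the file, `exploited_above(load_cves("cves.json"), 9.0)` would apply the same filter to the full set.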
### 2. **qa_dataset.json & qa_dataset.parquet**
50 French and 50 English question-answer pairs, each shaped like this:
```json
{
  "language": "English",
  "language_code": "en",
  "question": "What is a CVE?",
  "answer": "A CVE (Common Vulnerabilities and Exposures)...",
  "category": "CVE & Vulnerability Management"
}
```
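As a rough illustration, Q&A records of this shape can be split by language and grouped by category with a few lines of standard-library Python. The function names are hypothetical. The French record below is invented for the example, including the assumption that French entries use `"fr"` as their `language_code`.

```python
from collections import defaultdict


def by_language(records, language_code):
    """Keep only the Q&A records whose language_code matches."""
    return [r for r in records if r.get("language_code") == language_code]


def group_by_category(records):
    """Map each category to the list of questions filed under it."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r.get("category", "uncategorized")].append(r["question"])
    return dict(grouped)


# First record mirrors the example above; the second is an invented French counterpart.
qa_sample = [
    {"language": "English", "language_code": "en", "question": "What is a CVE?",
     "answer": "A CVE (Common Vulnerabilities and Exposures)...",
     "category": "CVE & Vulnerability Management"},
    {"language": "French", "language_code": "fr", "question": "Qu'est-ce qu'une CVE ?",
     "answer": "...", "category": "CVE & Vulnerability Management"},
]

english = by_language(qa_sample, "en")
print(len(english), group_by_category(english))
```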
### 3. **combined_dataset.json**
Merged dataset with complete metadata.
### 4. **statistics.json**
Dataset statistics:
- Distribution by severity (critical, high, medium)
- Distribution by year (2014-2024)
- Number of CVEs with active exploits
- Average CVSS score
- Attack vectors
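The aggregates listed above could be recomputed from the CVE records along the following lines. This is only a sketch: the output key names are assumptions, since the card does not show the actual layout of `statistics.json`, and the three sample records are hypothetical placeholders rather than entries from the dataset.

```python
from collections import Counter
from statistics import mean


def summarize(cves):
    """Compute severity, year, exploit, CVSS and attack-vector aggregates."""
    return {
        "by_severity": dict(Counter(c["cvss_severity"] for c in cves)),
        "by_year": dict(Counter(c["discovery_year"] for c in cves)),
        "actively_exploited": sum(1 for c in cves if c["actively_exploited"]),
        "average_cvss": round(mean(c["cvss_score"] for c in cves), 2),
        "by_attack_vector": dict(Counter(c["attack_vector"] for c in cves)),
    }


# Hypothetical records carrying only the fields that summarize() reads.
stats_sample = [
    {"cvss_severity": "critical", "discovery_year": 2021, "actively_exploited": True,
     "cvss_score": 10.0, "attack_vector": "network"},
    {"cvss_severity": "high", "discovery_year": 2021, "actively_exploited": True,
     "cvss_score": 8.0, "attack_vector": "network"},
    {"cvss_severity": "medium", "discovery_year": 2014, "actively_exploited": False,
     "cvss_score": 6.0, "attack_vector": "local"},
]

print(summarize(stats_sample))
```

Note that `statistics.mean` raises `StatisticsError` on an empty list, so callers would need to guard against an empty input.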
## Notable CVEs
| CVE ID | Name | CVSS | Year | Status |
|--------|------|------|------|--------|
| CVE-2021-44228 | Log4Shell | 10.0 | 2021 | Actively Exploited |
| CVE-2017-0144 | EternalBlue | 9.8 | 2017 | Actively Exploited |
| CVE-2021-27065 | ProxyLogon | 9.8 | 2021 | Actively Exploited |
| CVE-2021-34527 | PrintNightmare | 8.8 | 2021 | Actively Exploited |
| CVE-2014-0160 | Heartbleed | 7.5 | 2014 | Patch Available |
| CVE-2014-6271 | Shellshock | 9.8 | 2014 | Patch Available |
| CVE-2020-1472 | ZeroLogon | 10.0 | 2020 | Actively Exploited |
| CVE-2019-0708 | BlueKeep | 9.8 | 2019 | Patch Available |
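The card does not say how CVSS scores are mapped to severity labels. Assuming the dataset follows the CVSS v3.x qualitative rating scale (0.0 none, 0.1-3.9 low, 4.0-6.9 medium, 7.0-8.9 high, 9.0-10.0 critical), the mapping can be sketched as below. The function name `severity_for` is illustrative.

```python
def severity_for(score):
    """Map a CVSS base score to its CVSS v3.x qualitative rating (lowercase)."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    if score < 9.0:
        return "high"
    return "critical"


# Scores taken from the Notable CVEs table above.
for name, score in [("Log4Shell", 10.0), ("PrintNightmare", 8.8), ("Heartbleed", 7.5)]:
    print(f"{name}: {score} -> {severity_for(score)}")
```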
## Statistics
- **Total CVEs**: 100
- **Average CVSS Score**: 8.7
- **Critical CVEs**: 45
- **With Active Exploits**: 65
- **With Patches**: 98
- **Years Covered**: 2014-2024
## Use Cases
### For Security Teams
- Vulnerability management
- Patch prioritization
- Risk assessment
- Incident documentation
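For patch prioritization, one possible ordering uses the record fields `actively_exploited`, `exploit_available`, and `cvss_score`, in that order of importance. The ranking rule here is an assumption chosen for illustration, not a policy the dataset prescribes, and the sample records use placeholder IDs.

```python
def priority_key(cve):
    """Sort key: actively exploited first, then public exploit, then CVSS score."""
    return (
        bool(cve.get("actively_exploited")),
        bool(cve.get("exploit_available")),
        cve.get("cvss_score", 0.0),
    )


def prioritize(cves):
    """Return CVEs ordered from most to least urgent under priority_key."""
    return sorted(cves, key=priority_key, reverse=True)


# Hypothetical records with placeholder IDs.
backlog = [
    {"id": "A", "actively_exploited": False, "exploit_available": True, "cvss_score": 9.8},
    {"id": "B", "actively_exploited": True, "exploit_available": True, "cvss_score": 8.8},
    {"id": "C", "actively_exploited": False, "exploit_available": False, "cvss_score": 10.0},
]

print([c["id"] for c in prioritize(backlog)])
```

Under this rule an actively exploited 8.8 outranks an unexploited 10.0, which reflects the view that observed exploitation matters more than the base score alone.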
### For Education and Training
- Systems security courses
- CVE training programs
- Security certification (OSCP, CEH)
- Penetration testing workshops
### For Research
- Security trend analysis
- Vulnerability studies
- Exploit history
- Threat intelligence
## CVE Metadata
Each CVE includes:
- **CVE ID**: Unique identifier
- **Name**: Vulnerability nickname
- **Descriptions**: Bilingual (FR/EN)
- **CVSS Score**: Severity assessment (0-10)
- **Severity**: critical, high, medium, low
- **Affected Products**: List of software/versions
- **Attack Vector**: network, local, adjacent
- **CWE ID**: Weakness categorization
- **Exploit Available**: Yes/No
- **Actively Exploited**: Yes/No
- **Patch Available**: Yes/No
- **Discovery Year**: 2014-2024
- **Mitigations**: Recommended protective measures
- **References**: Links to security resources
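The field list above translates naturally into a record check. The following sketch assumes the JSON key names shown in the `cves.json` example and the value sets listed in this section. The `validate` helper is hypothetical, and the ID pattern reflects the standard CVE syntax of a four-digit year followed by four or more digits.

```python
import re

CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$")
SEVERITIES = {"critical", "high", "medium", "low"}
ATTACK_VECTORS = {"network", "local", "adjacent"}


def validate(record):
    """Return a list of problems found in one CVE record; empty means it passed."""
    problems = []
    if not CVE_ID.match(str(record.get("id", ""))):
        problems.append("id is not in CVE-YYYY-NNNN form")
    score = record.get("cvss_score")
    if isinstance(score, bool) or not isinstance(score, (int, float)) or not 0 <= score <= 10:
        problems.append("cvss_score must be a number between 0 and 10")
    if record.get("cvss_severity") not in SEVERITIES:
        problems.append("unknown cvss_severity")
    if record.get("attack_vector") not in ATTACK_VECTORS:
        problems.append("unknown attack_vector")
    for flag in ("exploit_available", "actively_exploited", "patch_available"):
        if not isinstance(record.get(flag), bool):
            problems.append(f"{flag} must be a boolean")
    year = record.get("discovery_year")
    if not isinstance(year, int) or not 2014 <= year <= 2024:
        problems.append("discovery_year outside 2014-2024")
    return problems


# Fields copied from the Log4Shell example record in this card.
log4shell = {
    "id": "CVE-2021-44228", "cvss_score": 10.0, "cvss_severity": "critical",
    "attack_vector": "network", "exploit_available": True,
    "actively_exploited": True, "patch_available": True, "discovery_year": 2021,
}

print(validate(log4shell))
print(validate(dict(log4shell, id="CVE-21-1", cvss_score=11)))
```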
## Questions and Answers
The dataset includes 100 Q&A pairs covering:
### Fundamental Concepts
- What is a CVE?
- What is CVSS?
- What is a zero-day vulnerability?
### Vulnerability Types
- RCE (Remote Code Execution)
- Privilege Escalation
- SQL Injection
- XXE (XML External Entity)
- Buffer Overflow
- Race Conditions
### Security Management
- Patch Management
- Vulnerability Assessment
- Attack Surface Management
- System Hardening
- Network Segmentation
- Security Compliance
### Tools and Techniques
- WAF (Web Application Firewall)
- SIEM (Security Information and Event Management)
- Penetration Testing
- Threat Intelligence
- Risk Analysis
## Related Collections
- [CVE Top 100 French Edition](https://huggingface.co/datasets/AYI-NEDJIMI/cve-top100-fr)
- [Cybersecurity Threats Dataset](https://huggingface.co/datasets)
- [Vulnerability Management Training](https://huggingface.co/datasets)
## Other Datasets
| Dataset | Description | Language |
|---------|-------------|----------|
| [Malware Analysis Dataset](https://huggingface.co/datasets) | Malware Analysis | EN |
| [Network Intrusion Detection](https://huggingface.co/datasets) | Intrusion Detection | EN |
| [Social Engineering Dataset](https://huggingface.co/datasets) | Social Engineering | EN |
| [Phishing Detection](https://huggingface.co/datasets) | Phishing Detection | EN |
## Data Formats
### JSON
```text
cves.json              Array of 100 CVEs
qa_dataset.json        Array of 100 Q&A pairs
combined_dataset.json  Complete structure with metadata
```
### Parquet
```text
qa_dataset.parquet     Columnar optimized format
```
## Updates
This dataset is updated quarterly to include new critical CVEs and important security patches.
Last Updated: **2024-2025**
### Related Articles
- [Complete Guide to CVEs and Their Management](https://www.ayinedjimi-consultants.fr/articles/cve-management)
- [Top 10 Critical Vulnerabilities 2024](https://www.ayinedjimi-consultants.fr/articles/top-vulnerabilities)
- [Patch Management Strategies](https://www.ayinedjimi-consultants.fr/articles/patch-management)
- [Threat Intelligence and Cybersecurity](https://www.ayinedjimi-consultants.fr/articles/threat-intelligence)
- [Penetration Testing and Red Team Methodologies](https://www.ayinedjimi-consultants.fr/articles/pentest-redteam)
## SEO Tags
`cve` `vulnerabilities` `exploits` `cybersecurity` `patch-management` `vulnerability-management` `pentest` `red-team` `soc` `threat-intelligence` `zero-day` `rce` `privilege-escalation` `security` `dataset` `educational` `infosec` `opsec` `defense` `offensive-security`
## License
Creative Commons Attribution 4.0 International (CC-BY-4.0)
This dataset can be used for personal, educational, or commercial purposes with attribution.
## Disclaimer
This dataset is provided for educational and research purposes. Using CVE information to attack systems without authorization is ILLEGAL. Users are responsible for complying with local and international cybersecurity laws.
## Support
For questions, corrections, or suggestions:
- GitHub Issues: [link]
- Email: datasets@ayinedjimi-consultants.fr
- Forum: [link]
---
**Building Stronger Cybersecurity Together** 🛡️
Made with ❤️ by AYI-NEDJIMI Consultants
## Author
**Ayi NEDJIMI** - Cybersecurity Consultant & Trainer | AI Expert
- [Professional Bio](https://ayinedjimi-consultants.fr/bio.html)
- [All Articles](https://ayinedjimi-consultants.fr/articles.html)
- [Free Guides & Whitepapers](https://ayinedjimi-consultants.fr/guides-gratuits.html)
- [Training Programs](https://ayinedjimi-consultants.fr/formations.html)
## Related Articles
- [Top 10 Solutions EDR/XDR 2025](https://ayinedjimi-consultants.fr/top-10-solutions-edr-xdr-2025.html)
- [SSRF Moderne](https://ayinedjimi-consultants.fr/articles/techniques-hacking/ssrf-moderne.html)
- [Désérialisation & Gadgets](https://ayinedjimi-consultants.fr/articles/techniques-hacking/deserialisation-gadgets.html)
## Free Cybersecurity Resources
- [Livre Blanc NIS 2](https://ayinedjimi-consultants.fr/livre-blanc-nis2.html)
- [Livre Blanc Sécurité Active Directory](https://ayinedjimi-consultants.fr/livre-blanc-securite-active-directory.html)
- [Livre Blanc Pentest Cloud AWS/Azure/GCP](https://ayinedjimi-consultants.fr/livre-blanc-pentest-cloud-aws-azure-gcp.html)
- [Livre Blanc Sécurité Kubernetes](https://ayinedjimi-consultants.fr/livre-blanc-securite-kubernetes.html)
- [Livre Blanc IA Cyberdéfense](https://ayinedjimi-consultants.fr/livre-blanc-ia-cyberdefense.html)
- [Livre Blanc Anatomie Ransomware](https://ayinedjimi-consultants.fr/livre-blanc-anatomie-attaque-ransomware.html)
- [Guide Sécurisation AD 2025](https://ayinedjimi-consultants.fr/guide-securisation-active-directory-2025.html)
- [Guide Tiering Model AD](https://ayinedjimi-consultants.fr/livres-blancs/tiering-model/)
## Part of the Collection
This dataset is part of the [Cybersecurity Datasets & Tools Collection](https://huggingface.co/collections/AYI-NEDJIMI/cybersecurity-datasets-and-tools-by-ayi-nedjimi-698e4b5777848dba76c8b169) by AYI-NEDJIMI Consultants.
Provided by:
Whoisjutanlee



