
lacg030175/CICIDS2017

Source: Hugging Face. Updated 2026-04-02; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/lacg030175/CICIDS2017
Resource description:
---
language:
- en
license: cc-by-4.0
size_categories:
- 1M<n<10M
task_categories:
- tabular-classification
tags:
- network-intrusion-detection
- cybersecurity
- CICIDS2017
- IDS
- binary-classification
pretty_name: CICIDS2017 Network Intrusion Detection
configs:
- config_name: temporal_3way
  data_files:
  - split: train
    path: temporal_3way/train-*
  - split: test
    path: temporal_3way/test-*
  - split: validation
    path: temporal_3way/validation-*
  default: true
- config_name: random_3way
  data_files:
  - split: train
    path: random_3way/train-*
  - split: test
    path: random_3way/test-*
  - split: validation
    path: random_3way/validation-*
- config_name: temporal
  data_files:
  - split: train
    path: temporal/train-*
  - split: test
    path: temporal/test-*
- config_name: standard
  data_files:
  - split: train
    path: temporal/train-*
  - split: test
    path: temporal/test-*
- config_name: random
  data_files:
  - split: train
    path: random/train-*
  - split: test
    path: random/test-*
---

# CICIDS2017 Network Intrusion Detection Dataset

The [CICIDS2017](https://www.unb.ca/cic/datasets/ids-2017.html) dataset from the Canadian Institute for Cybersecurity, provided with **temporal and random splits** for fair evaluation.

## Configurations

### `temporal` (default): Day-Based Temporal Split

> **Note:** `standard` is an alias for `temporal`; both load the same data.

Train on Monday-Thursday, test on Friday. The model must generalize to unseen attack types (DDoS, Botnet, PortScan).

```python
from datasets import load_dataset

ds = load_dataset("lacg030175/CICIDS2017", "temporal")  # or "standard"
# ds["train"]: 2,125,158 rows (Mon-Thu)
# ds["test"]: 702,718 rows (Friday)
```

**Train attacks:** 267,771 / 2,125,158 (12.6%)
**Test attacks:** 288,785 / 702,718 (41.1%)

### `random`: Stratified Random Split

80/20 stratified random split from all days combined.

```python
ds = load_dataset("lacg030175/CICIDS2017", "random")
# ds["train"]: 2,262,300 rows
# ds["test"]: 565,576 rows
```

## Top-20 RF Features

1. Bwd Packet Length Std
2. Destination Port
3. Packet Length Std
4. Bwd Packet Length Max
5. Avg Bwd Segment Size
6. Bwd Packet Length Mean
7. Fwd IAT Std
8. Average Packet Size
9. Packet Length Variance
10. Flow IAT Max
11. Packet Length Mean
12. Init_Win_bytes_forward
13. Idle Min
14. Idle Mean
15. Fwd IAT Max
16. Flow IAT Std
17. Flow Packets/s
18. Flow IAT Mean
19. Fwd Header Length
20. Bwd Header Length

## Attack Types

| Day | Attack Types |
|---|---|
| Monday | Benign only |
| Tuesday | FTP-Patator, SSH-Patator |
| Wednesday | DoS Hulk, DoS GoldenEye, DoS Slowhttptest, DoS slowloris, Heartbleed |
| Thursday | Web Attack (Brute Force, XSS, SQL Injection), Infiltration |
| **Friday (test)** | **Bot, DDoS, PortScan** |

## Labels

- **Binary** (`label`): 0 = BENIGN, 1 = Attack
- **Multi-class** (`Label`): 15 categories (BENIGN + 14 attack types)

## Features

78 numeric flow-level features extracted by CICFlowMeter.

## Preprocessing

- Removed rows with NaN/infinity values
- Stripped whitespace from column names and labels
- All features converted to numeric (float64)
- Added binary `label` column (0=BENIGN, 1=Attack)

## Citation

```bibtex
@inproceedings{sharafaldin2018toward,
  title={Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization},
  author={Sharafaldin, Iman and Lashkari, Arash Habibi and Ghorbani, Ali A},
  booktitle={International Conference on Information Systems Security and Privacy},
  year={2018}
}
```

## License

CC BY 4.0. Original dataset by the Canadian Institute for Cybersecurity, University of New Brunswick.
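
## Sketch: Preprocessing and Day-Based Split

The preprocessing steps and the Monday-Thursday / Friday split described above can be illustrated on a handful of toy rows. This is a minimal, dependency-free sketch and not the author's actual pipeline, which is not shown in the card. The list-of-dicts input shape, the `day` field, and the helper names `preprocess_rows` and `temporal_split` are assumptions made for illustration; the published configs may not expose a day column at all.

```python
import math

BENIGN = "BENIGN"
TEST_DAYS = {"Friday"}  # assumption: Friday is the held-out day, as in `temporal`


def preprocess_rows(rows, label_key="Label", day_key="day"):
    """Clean raw flow records given as a list of dicts (hypothetical input shape)."""
    cleaned = []
    for raw in rows:
        # Strip whitespace from column names.
        row = {key.strip(): value for key, value in raw.items()}
        # Strip whitespace from the multi-class label.
        multiclass = str(row.pop(label_key)).strip()
        day = row.pop(day_key)  # hypothetical bookkeeping column, not a feature
        try:
            # Convert every remaining feature to a Python float (float64).
            features = {name: float(value) for name, value in row.items()}
        except (TypeError, ValueError):
            continue  # non-numeric feature value: drop the row
        # Drop rows that contain NaN or +/- infinity.
        if not all(math.isfinite(value) for value in features.values()):
            continue
        features[label_key] = multiclass
        features["label"] = 0 if multiclass == BENIGN else 1  # binary target
        features[day_key] = day
        cleaned.append(features)
    return cleaned


def temporal_split(rows, day_key="day", test_days=TEST_DAYS):
    """Put every row from a test day in the test set and the rest in train."""
    train = [row for row in rows if row[day_key] not in test_days]
    test = [row for row in rows if row[day_key] in test_days]
    return train, test


# Toy records imitating raw CSV rows (values are made up for illustration).
raw_rows = [
    {" Destination Port": "80", " Flow IAT Max": "1200", " Label": "BENIGN ", "day": "Monday"},
    {" Destination Port": "21", " Flow IAT Max": "Infinity", " Label": "FTP-Patator", "day": "Tuesday"},
    {" Destination Port": "22", " Flow IAT Max": "350", " Label": "SSH-Patator", "day": "Tuesday"},
    {" Destination Port": "80", " Flow IAT Max": "NaN", " Label": "DDoS", "day": "Friday"},
    {" Destination Port": "8080", " Flow IAT Max": "90", " Label": "PortScan", "day": "Friday"},
]

rows = preprocess_rows(raw_rows)
train, test = temporal_split(rows)
print(len(rows), len(train), len(test))
```

Because the split is made by day rather than by random sampling, the attack types in the toy test set (here only PortScan) never appear in the toy training set, which mirrors the generalization setting described for the `temporal` configuration.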
Provided by:
lacg030175