
lacg030175/CIC-IoT-2023-full

Source: Hugging Face. Updated 2026-04-07. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/lacg030175/CIC-IoT-2023-full
Description:
---
language:
- en
license: cc-by-4.0
size_categories:
- 10M<n<100M
task_categories:
- tabular-classification
tags:
- network-intrusion-detection
- cybersecurity
- CIC-IoT-2023
- IoT
- IDS
- binary-classification
pretty_name: CIC-IoT-2023 Full (46M) IoT Intrusion Detection
configs:
- config_name: random
  data_files:
  - split: train
    path: random/train-*
  - split: test
    path: random/test-*
- config_name: random_3way
  data_files:
  - split: train
    path: random_3way/train-*
  - split: test
    path: random_3way/test-*
  - split: validation
    path: random_3way/validation-*
  default: true
---

# CIC-IoT-2023 Full Dataset (46M+ rows)

The full [CICIoT2023](https://www.unb.ca/cic/datasets/iotdataset-2023.html) dataset: all 38,508,041 rows, no subsampling.

## Configuration

### `random_3way` (default): 80/10/10 Three-Way Split

Stratified random split with fully separated sets:

- **Train (80%)**: 30,806,432 rows, used for model training and architecture search
- **Test (10%)**: 3,850,804 rows, used for threshold calibration (held out from training)
- **Validation (10%)**: 3,850,805 rows, used for final reported metrics (never touched)

```python
from datasets import load_dataset

# Load the default three-way configuration (train / test / validation).
ds = load_dataset("lacg030175/CIC-IoT-2023-full", "random_3way")
```

## Class Distribution

| Class | Count | % |
|---|---:|---:|
| Benign | 1,098,126 | 2.9% |
| Attack | 37,409,915 | 97.1% |

## Comparison with Subsampled Version

| Version | Rows | Benign % | Purpose |
|---|---:|---:|---|
| `lacg030175/CIC-IoT-2023` | ~1.3M | ~15% | Fast experiments (112 runs) |
| **This dataset** | **38,508,041** | **2.9%** | **Literature comparison** |

## Top-20 RF Features

1. Number
2. ack_flag_number
3. Header_Length
4. HTTPS
5. Time_To_Live
6. psh_flag_number
7. Rate
8. IAT
9. Max
10. ack_count
11. Tot sum
12. Variance
13. Tot size
14. Std
15. Min
16. AVG
17. syn_count
18. syn_flag_number
19. HTTP
20. fin_count

## Attack Types (7 classes, 33 sub-types)

The table also lists Benign traffic, which brings the totals to 8 classes and 34 categories.

| Class | Sub-types |
|---|---|
| Benign | BenignTraffic |
| BruteForce | DictionaryBruteForce |
| DDoS | 12 sub-types |
| DoS | 4 sub-types |
| Mirai | 3 sub-types |
| Recon | 5 sub-types |
| Spoofing | 2 sub-types |
| Web-based | 6 sub-types |

## Labels

- **Binary** (`label`): 0 = Benign, 1 = Attack
- **Multi-class** (`Label`): 34 categories
- **Grouped** (`attack_class`): 8 classes

## Citation

```bibtex
@article{neto2023ciciot,
  title={CICIoT2023: A Real-Time Dataset and Benchmark for Large-Scale Attacks in IoT Environment},
  author={Neto, Euclides Carlos Pinto and others},
  journal={Sensors},
  volume={23},
  number={13},
  year={2023},
  publisher={MDPI}
}
```

## License

CC BY 4.0. Original dataset by the Canadian Institute for Cybersecurity, University of New Brunswick.
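
The class distribution reported in the card is heavily skewed toward attack traffic, so a binary classifier trained on it will usually need some form of reweighting. The sketch below is not part of the original card. It uses only the row counts quoted above to derive inverse-frequency ("balanced") class weights and to check that the three split sizes add up to the stated total. The helper names `balanced_class_weights` and `split_fractions` are hypothetical, and the weighting formula is one common heuristic rather than the dataset author's method.

```python
# Counts copied from the "Class Distribution" table (0 = Benign, 1 = Attack).
CLASS_COUNTS = {0: 1_098_126, 1: 37_409_915}

# Split sizes copied from the `random_3way` description.
SPLIT_SIZES = {"train": 30_806_432, "test": 3_850_804, "validation": 3_850_805}


def balanced_class_weights(counts):
    """Inverse-frequency weights: n_total / (n_classes * n_class).

    This is an assumed heuristic for illustration, not something the card prescribes.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


def split_fractions(sizes):
    """Fraction of all rows that falls into each split."""
    total = sum(sizes.values())
    return {name: n / total for name, n in sizes.items()}


weights = balanced_class_weights(CLASS_COUNTS)
fractions = split_fractions(SPLIT_SIZES)

if __name__ == "__main__":
    # The split sizes and the class counts should describe the same number of rows.
    print("rows (splits): ", sum(SPLIT_SIZES.values()))
    print("rows (classes):", sum(CLASS_COUNTS.values()))
    for label, w in weights.items():
        print(f"weight for label {label}: {w:.3f}")
    for name, frac in fractions.items():
        print(f"{name}: {frac:.4f}")
```

With the counts above, the Benign weight comes out at about 17.5 and the Attack weight at about 0.51. A dictionary of this shape can be passed to estimators that accept per-class weights, but whether reweighting, resampling, or threshold calibration on the test split works best is an empirical question for each model.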
Provider:
lacg030175