
A 4-Month Dataset of SSH Botnet Interactions and Command Payloads

Source: DataCite Commons | Updated: 2026-05-06 | Indexed: 2026-05-07
Download link:
https://zenodo.org/doi/10.5281/zenodo.19815504
Resource description:
Overview

This dataset contains 145,425 security events collected by a custom multi-threaded SSH honeypot. The data reflects real-world automated and manual attack patterns against Linux-based systems, captured over a focused 4-month observation window from July 27, 2025, to November 14, 2025.

Research Context

The collection was conducted as part of the research project 'High-Interaction SSH Threat Intelligence & Attack Modeling' at the National University 'Odesa Law Academy'.

Revision History (v2.1 Update)

Version 2.1 (April 2026): Final validated release.
  Logging Level Inversion: Physically updated the level column. INFO now represents transport-layer noise (94.6%), while WARNING marks active application-layer interactions (5.4%).
  Metadata Synchronization: All documentation and BibTeX records are updated to reflect the refined 4-month data window and final event counts.
Version 2.0: Conducted thorough data sanitization, excluding 74 internal administrative sessions (localhost) and debugging logs from the initial setup phase.
Version 1.0: Initial raw release.

Technical Specifications

Engine: Multi-threaded Python 3.10 application using the Paramiko library.
Core Logic: Handles SSHv2 transport and authentication layers by subclassing paramiko.ServerInterface.
Session Management: Incoming connections are encapsulated in individual threads, where each session is assigned a unique UUID for full "kill chain" reconstruction.
Payload Interception: Command requests are intercepted via the check_channel_exec_request method, allowing for the capture of raw payloads (including malware droppers and fileless /dev/tcp strings) without executing them on the host system.
Persistence: Data is saved to a SQLite 3 database in real-time using a synchronous write-ahead logging (WAL) approach.

Key Research Findings (v2.1)

Attack Intensity: Analysis shows peak intensities exceeding 10,700 interactions per hour during automated surge events.
Payload Diversity: The dataset captures 28 unique interactive shell sessions, including sophisticated fileless exploitation via bash sockets.
Credential Intelligence: Records 2,109 unique credential pairs, providing insights into modern automated brute-force patterns.
High-Fidelity Noise Reduction: The pre-filtered level field allows researchers to immediately isolate the 5.4% of high-value attack payloads from background connection noise.

Data Structure

The dataset is provided in SQLite3 (.db) and CSV formats.
Fields: id, timestamp, session_id, ip, port, event_type, message, command, level.

Authors & Affiliation

Viktor Boiko (ORCID: 0000-0001-5929-657X), Scientific Supervisor & Lead Researcher.
Oleksandr Niiakyi (ORCID: 0009-0005-1025-1617), Software Developer & Researcher.
Affiliation: Faculty of Cybersecurity and Information Technologies, National University "Odesa Law Academy".

Licensing

Creative Commons Attribution 4.0 International (CC BY 4.0).
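
As an illustration of how the level and session_id fields described above might be used together, the following is a minimal Python sketch that isolates WARNING-level rows and groups captured commands by session. It is not taken from the dataset's own tooling. The table name `events`, the column types, the ISO-style timestamps, the `event_type` values, and all demo rows are assumptions made for the example; only the nine column names come from the description. Check the real schema (for instance with `.schema` in the sqlite3 shell) before adapting it.

```python
import sqlite3

# Assumption: the table is named "events". The actual table name inside
# the published .db file is not stated in the description and may differ.
TABLE = "events"


def make_demo_db():
    """Build an in-memory database with the nine documented columns.

    All rows are invented for illustration (documentation-range IPs,
    made-up event_type values); they are not taken from the dataset.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"""CREATE TABLE {TABLE} (
            id INTEGER PRIMARY KEY,
            timestamp TEXT,
            session_id TEXT,
            ip TEXT,
            port INTEGER,
            event_type TEXT,
            message TEXT,
            command TEXT,
            level TEXT
        )"""
    )
    rows = [
        (1, "2025-08-01T00:00:00", "s-1", "192.0.2.10", 40000,
         "connect", "connection opened", None, "INFO"),
        (2, "2025-08-01T00:00:05", "s-2", "198.51.100.7", 40001,
         "exec", "exec request", "uname -a", "WARNING"),
        (3, "2025-08-01T00:00:06", "s-2", "198.51.100.7", 40001,
         "exec", "exec request", "cat /proc/cpuinfo", "WARNING"),
        (4, "2025-08-01T00:00:09", "s-3", "203.0.113.5", 40002,
         "connect", "connection opened", None, "INFO"),
    ]
    conn.executemany(
        f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    return conn


def high_value_sessions(conn):
    """Return {session_id: [command, ...]} for WARNING-level rows.

    Commands are listed in timestamp order (ties broken by id), so each
    list approximates the sequence of payloads seen in one session.
    """
    cursor = conn.execute(
        f"""SELECT session_id, command
            FROM {TABLE}
            WHERE level = 'WARNING' AND command IS NOT NULL
            ORDER BY timestamp, id"""
    )
    sessions = {}
    for session_id, command in cursor:
        sessions.setdefault(session_id, []).append(command)
    return sessions


if __name__ == "__main__":
    demo = make_demo_db()
    for sid, commands in high_value_sessions(demo).items():
        print(sid, commands)
```

To run the same query against the published file, replace `make_demo_db()` with `sqlite3.connect("path/to/dataset.db")` (the path is a placeholder) and adjust `TABLE` to the real table name.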
Provider:
Zenodo
Created:
2026-04-27