
IoT-SQL Dataset: A Benchmark for Text-to-SQL and IoT Threat Classification

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/15000587
Description:
Overview

This dataset accompanies the paper "Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats", published in TrustNLP: Fifth Workshop on Trustworthy Natural Language Processing, co-located with NAACL 2025.

The dataset is designed to facilitate research in:

- Text-to-SQL: generating SQL queries from natural language.
- IoT Network Traffic Analysis: detecting malicious activity in IoT environments.
- Multimodal Learning: combining structured database queries with network security classification.

Dataset Contents

The dataset consists of three main components:

1. IoT Database (iot_database.sql.gz)
   SQL schema and data from the IoT-23 logs and Smart Building Sensor datasets, ready to be loaded into a database.

2. Text-to-SQL Data (text-to-SQL-data.zip)
   Includes queries with joins, aggregations, temporal conditions, and nested clauses. The data is split into training (6,591), validation (2,197), and test (2,197) sets.

3. Network Traffic Data (network_traffic_data.zip)
   Each record is labeled as benign or malicious. Features include timestamps, IPs, ports, protocols, byte counts, and connection history. Malicious traffic includes DDoS, C&C, and botnet-related activity.
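As a quick sanity check on the Text-to-SQL split sizes listed above, the short Python sketch below computes the total number of examples and the share of each split. The counts come from the description; the variable names are illustrative only and are not part of the dataset.

```python
# Split sizes as listed in the dataset description.
split_sizes = {"train": 6591, "validation": 2197, "test": 2197}

# Total number of Text-to-SQL examples across all splits.
total = sum(split_sizes.values())

# Fraction of the data that falls in each split.
fractions = {name: count / total for name, count in split_sizes.items()}

print(f"total examples: {total}")
for name, frac in fractions.items():
    print(f"{name}: {split_sizes[name]} ({frac:.1%})")
```

The counts add up to 10,985 examples, which corresponds to a 60/20/20 split between training, validation, and test.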
Usage Instructions

Setting Up the Database

1. Extract the database file:
   gunzip iot_database.sql.gz

2. Import into MySQL:
   mysql -u -p < iot_database.sql

3. Verify the schema:
   SHOW TABLES;

Citation

If you use this dataset, please cite:

@inproceedings{pavlich2025beyond,
  author = {Ryan Pavlich and Nima Ebadi and Richard Tarbell and Billy Linares and Adrian Tan and Rachael Humphreys and Jayanta Kumar Das and Rambod Ghandiparsi and Hannah Haley and Jerris George and Rocky Slavin and Kim-Kwang Raymond Choo and Glenn Dietrich and Anthony Rios},
  title = {Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats},
  booktitle = {TrustNLP: Fifth Workshop on Trustworthy Natural Language Processing},
  year = {2025},
  organization = {NAACL}
}

Contact

For questions or collaborations, contact Anthony Rios at Anthony.Rios@utsa.edu.
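Where gunzip is not available (for example on Windows), the extraction step from Setting Up the Database can be approximated with Python's standard library. The following is a minimal sketch, not part of the dataset's tooling: decompress_gzip is a hypothetical helper name, and unlike gunzip's default behavior it leaves the original .gz archive in place.

```python
import gzip
import shutil
from pathlib import Path
from typing import Optional


def decompress_gzip(src: str, dst: Optional[str] = None) -> Path:
    """Decompress a .gz file and return the path of the output file.

    Hypothetical helper for illustration; the source archive is kept.
    """
    src_path = Path(src)
    # By default, drop the trailing ".gz": iot_database.sql.gz -> iot_database.sql
    dst_path = Path(dst) if dst else src_path.with_suffix("")
    with gzip.open(src_path, "rb") as f_in, open(dst_path, "wb") as f_out:
        # Stream the data in chunks so a large SQL dump does not need to fit in memory.
        shutil.copyfileobj(f_in, f_out)
    return dst_path


# Example call, assuming the archive sits in the current directory:
# decompress_gzip("iot_database.sql.gz")
```

The resulting iot_database.sql file can then be imported with the mysql command shown in the setup steps.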
Created: 2025-03-10