Multimodal Keyboard Acoustic (MKA) Datasets

Mendeley Data | Updated 2024-06-19 | Indexed 2024-06-26

Download link:

https://data.mendeley.com/datasets/bpt2hvf8n3

Description:

Our research team from the Computer Science Department at the University of Halabja has developed the Multimodal Keyboard Acoustic (MKA) Datasets, a collection designed to aid keyboard sound recognition and analysis and to support defenses against acoustic-based cyber threats. With cyberattacks growing more sophisticated, a focus on keyboard acoustics is timely. The MKA Datasets contain recordings from six commonly used platforms: HP, Lenovo, MSI, Mac, Messenger, and Zoom. Each platform's dataset includes raw recordings, segmented sound files, and matrices derived from these sounds, capturing variations in typing behavior across devices and applications.

The datasets are organized for ease of use and analysis. Each platform has a dedicated folder containing subfolders for raw data, segmented sound files, and matrices. An additional aggregated folder combines data from all platforms for cross-platform analysis. In total, the MKA datasets consist of around 2630 .wav files for sound segments, as well as an equal number of matrix and .txt files. The number of files varies by platform, with approximately 70 files each for HP, Lenovo, MSI, Zoom, and Messenger, and 61 files for Mac.

Within each platform's dataset, the "Sound segments" folder stores six one-second WAV audio excerpts per class, derived from the corresponding raw data files. The excerpts are renamed using the convention "class_name+1" to "class_name+6" in the per-platform datasets and "class_name+platform_name1" to "class_name+platform_name6" in the aggregated datasets. The "Sound segment (.matrix)" folder contains feature representations, such as MFCCs, extracted from each sound segment. The "Sound segment metadata (.txt)" folder holds information about each sound segment, including recording conditions, platform information, and keystroke class labels.

Beyond cybersecurity, the MKA datasets have potential applications in domains such as speech recognition and natural language processing, where their diverse set of sound profiles can support the development of more robust and adaptable algorithms. The datasets are therefore useful both for advancing cybersecurity research and for improving the efficiency and accuracy of human-computer interaction technologies. We aim to contribute to both academic research and practical applications in these interconnected areas.
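The segment naming convention described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the description does not say whether the "+" in "class_name+1" is a literal character or means concatenation, so the sketch assumes plain concatenation, and the function name `segment_names` and the example class label "A" are hypothetical. Check the actual file names in the download before relying on it.

```python
from typing import List, Optional

# The description states six one-second excerpts per keystroke class.
SEGMENTS_PER_CLASS = 6


def segment_names(class_name: str,
                  platform: Optional[str] = None,
                  ext: str = ".wav") -> List[str]:
    """Build the six expected segment file names for one keystroke class.

    Assumption: "+" in the documented convention means string concatenation,
    so class "A" gives "A1.wav" .. "A6.wav" in a per-platform folder and
    "AHP1.wav" .. "AHP6.wav" in the aggregated folder for platform "HP".
    """
    prefix = class_name + (platform or "")
    return [f"{prefix}{i}{ext}" for i in range(1, SEGMENTS_PER_CLASS + 1)]


if __name__ == "__main__":
    print(segment_names("A"))        # per-platform naming
    print(segment_names("A", "HP"))  # aggregated naming
```

Passing `ext=".txt"` yields the matching metadata file names, assuming the metadata files share the same base names as the sound segments, which the description implies but does not state outright.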

Created:

2024-06-13

Background overview

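The Multimodal Keyboard Acoustic (MKA) Datasets are a collection for keyboard sound recognition and analysis, intended to help defend against acoustic-based cyber threats. They contain recordings from six common platforms (HP, Lenovo, MSI, Mac, Messenger, and Zoom), with about 2630 sound-segment files in total, and cover raw recordings, segmented sounds, and feature matrices, supporting cross-platform analysis and algorithm development. The datasets have potential applications in cybersecurity, speech recognition, and related fields, and may help improve the efficiency and accuracy of human-computer interaction technologies.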

The summary above was collected and generated by 遇见数据集.
