HAC-MVC: A Multimodal Dataset of Human, AI-Generated, and Cloned Malicious Voice Commands for Deepfake Detection and Neural Audio Fingerprinting

NIAID Data Ecosystem, indexed 2026-05-10

Download link:

https://data.mendeley.com/datasets/5wcgbtfgvv

Description:

Voice-activated (VA) systems are increasingly deployed in smartphones, smart homes, and enterprise environments, enabling convenient human–machine interaction through speech commands. However, recent advances in voice cloning and AI-based speech synthesis have introduced significant security risks: attackers can generate malicious voice commands that may bypass traditional authentication mechanisms.

To facilitate research in voice spoofing detection and secure voice-based interaction, HAC-MVC provides a multimodal audio dataset of human, cloned, and fully synthetic malicious voice commands. It is designed to support the development and evaluation of deep learning models for deepfake detection, neural audio fingerprinting, and voice command security analysis.

The dataset contains 4,500 audio recordings divided into three categories: human_original, human_cloned, and AI_synthetic. Authentic speech samples were recorded from 30 participants (15 male and 15 female), each speaking approximately 50 malicious command phrases such as “Turn off Windows Defender” and “Erase command history.” Recordings were captured using iPhone 8+ and Samsung Galaxy A13 smartphones in a controlled indoor environment to ensure consistent acoustic quality.

To simulate modern voice spoofing attacks, the original recordings were cloned using Chatterbox AI, while fully synthetic samples were generated using the Voice Maker, Free Text to Speech Converter, and AI Lab text-to-speech platforms. Drawing on several generators introduces variability in synthetic speech characteristics, which makes the dataset suitable for robust deepfake detection research.

HAC-MVC provides a structured benchmark for evaluating machine learning and deep learning methods across the entire pipeline, from feature extraction and representation learning to classification and threat detection. It is particularly useful for research in voice biometric authentication, adversarial audio detection, cybersecurity systems, and AI-driven speech forensics.
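
The three-category structure described above could be turned into labeled training data along the following lines. This is a minimal sketch, not the dataset authors' tooling: it assumes one top-level folder per category named after the category (human_original, human_cloned, AI_synthetic), and it assumes a set of common audio extensions, since the description states neither the directory layout nor the file format. The function names `index_dataset` and `class_counts` and the integer label mapping are hypothetical choices for illustration.

```python
from collections import Counter
from pathlib import Path

# Category names come from the dataset description; the integer labels
# are an arbitrary mapping chosen for this sketch.
CATEGORIES = {"human_original": 0, "human_cloned": 1, "AI_synthetic": 2}

# Assumed extensions: the description does not state the audio format.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}


def index_dataset(root):
    """Return (path, label) pairs for audio files found under root.

    Assumes (hypothetically) one top-level folder per category.
    """
    root = Path(root)
    items = []
    for name, label in CATEGORIES.items():
        folder = root / name
        if not folder.is_dir():
            continue  # tolerate a missing category folder
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
                items.append((path, label))
    return items


def class_counts(items):
    """Count indexed samples per integer label."""
    return Counter(label for _, label in items)
```

Calling `class_counts(index_dataset("path/to/HAC-MVC"))` on a local copy gives a quick check that every category was found before any model training; the path shown is a placeholder.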

Created:

2026-03-11