A Multi-Condition Replay and Channel-Based Speech Spoofing Dataset Across 15 Bangladeshi Regions with Regional Accent Variability

Source: DataONE. Updated: 2026-01-28. Indexed: 2026-02-07.

Download link:

https://search.dataone.org/view/sha256:f1abdbef56e0315d747728bb6e34241dbc9c0b65741b342736afe73962f9bb8e

Resource description:

This dataset contains a large-scale speech spoofing corpus collected from speakers across 14 districts and the Sandwip Upazila of Bangladesh, capturing regional accent variability. The dataset includes both genuine and spoofed speech recorded under five real-world attack conditions: Monster, Monster–Telephone, Telephone, Robot, and Radio. Spoofed samples were generated through physical replay devices and communication channels, preserving realistic channel- and device-induced distortions. The audio files are organized into multiple subsets, including raw recordings, class-wise real vs. spoof partitions, and predefined training, validation, and testing splits. All recordings underwent quality control and preprocessing steps, including resampling, silence removal, and amplitude normalization. Due to file size constraints, the dataset is released as multiple compressed archives, each corresponding to specific spoofing scenarios and dataset splits. This dataset is intended to support research on speech anti-spoofing, replay attack detection, channel-aware speaker verification, and robustness evaluation of deep learning–based voice biometric systems.
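The preprocessing chain named in the description (resampling, silence removal, amplitude normalization) can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the description does not state the target sample rate, the silence criterion, or the normalization method, so the 16 kHz target, the 0.01 amplitude threshold, the 0.95 peak level, and all function names below are assumptions chosen for illustration. The resampler uses plain linear interpolation with no anti-aliasing filter, which is adequate for a demonstration but not for production audio.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Naive linear-interpolation resampler (assumption: no anti-aliasing)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # fractional index into the input
        lo = int(pos)
        if lo >= len(samples) - 1:
            out.append(samples[-1])        # hold the last sample at the edge
        else:
            frac = pos - lo
            out.append(samples[lo] * (1 - frac) + samples[lo + 1] * frac)
    return out


def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples whose magnitude is below threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []                          # the whole clip counts as silence
    return list(samples[loud[0]:loud[-1] + 1])


def peak_normalize(samples, target_peak=0.95):
    """Scale the clip so its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


def preprocess(samples, src_rate, dst_rate=16000,
               threshold=0.01, target_peak=0.95):
    """Apply the three steps in the order listed in the description."""
    x = resample_linear(samples, src_rate, dst_rate)
    x = trim_silence(x, threshold)
    return peak_normalize(x, target_peak)
```

Here `samples` is a list of floats in the range [-1.0, 1.0]. A call such as `preprocess(clip, 44100)` would return the clip at the assumed 16 kHz rate, trimmed at both ends and scaled to the assumed peak level.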

Created:

2026-01-30