A Pilot Speech Corpus for Studying Device and Environmental Variability in Voice Biometrics

Source: Figshare. Updated 2025-09-03; indexed 2026-04-28.

Download link:

https://figshare.com/articles/dataset/_b_A_Pilot_Speech_Corpus_for_Studying_Device_and_Environmental_Variability_in_Voice_Biometrics_b_/30039037

Description:

This dataset provides a curated pilot corpus for studying device and environmental variability in voice biometrics. It contains 480 speech recordings from 12 participants (Japan, Nigeria, Ivory Coast, France, Germany, and Indonesia), each contributing 40 utterances recorded across multiple devices and environments.

Recordings were made using the Samsung A04s, OnePlus Nord (both direct and in-call), iPhone 15 Pro, and a USB condenser microphone connected to a MacBook, under both indoor (semi-controlled lobby) and outdoor (campus) conditions. All files are stored in WAV format (8–16 kHz, 16-bit PCM) and are accompanied by a metadata file (CSV/Excel) with anonymized attributes such as nationality, gender, age, and English proficiency.

The dataset supports research in speech enhancement (spectral subtraction, Wiener filtering, adaptive filtering), speaker identification and verification, spoofing resilience, and liveness detection. In the validation experiments, adaptive filtering achieved the highest accuracy (97%), highlighting both the challenges of cross-device variability and the potential of robust enhancement methods.

The corpus is intended as a benchmark for developing secure and consistent voice biometric systems, particularly for real-world applications such as mobile banking authentication and low-resource environments.
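Because the recordings span several devices and a range of sample rates, a quick header check on each file can be useful before running experiments. The following is a minimal sketch using only Python's standard `wave` module. The function names `describe_wav` and `matches_described_format` are hypothetical helpers, not part of the dataset, and the thresholds simply restate the format given in the description (8–16 kHz, 16-bit PCM).

```python
import wave

# Format constraints restated from the dataset description (assumption:
# every file falls in this range): 8-16 kHz sample rate, 16-bit PCM.
MIN_RATE_HZ = 8_000
MAX_RATE_HZ = 16_000
EXPECTED_BITS = 16


def describe_wav(path):
    """Return basic format information read from a WAV file header."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate if rate else 0.0,
        }


def matches_described_format(info):
    """Check a describe_wav() result against the stated format."""
    return (
        MIN_RATE_HZ <= info["sample_rate_hz"] <= MAX_RATE_HZ
        and info["bits_per_sample"] == EXPECTED_BITS
    )
```

Calling `describe_wav` on each downloaded file and filtering with `matches_described_format` would flag any recording whose header disagrees with the stated format. The actual file names and directory layout are not given in the description, so the path passed in is left to the reader.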

Created:

2025-09-03
