Multi-Type Tampered Audio Dataset

Name: Multi-Type Tampered Audio Dataset
Creator: Science Data Bank
Published: 2026-04-03 08:10:51
License: Not specified

Source: DataCite Commons (updated 2026-04-03, indexed 2026-05-05)

Download link:

https://www.scidb.cn/detail?dataSetId=050a13dfc4ee43efb526bea2ff450a07

Description:

The Multi-Type Tampered Audio Dataset (MTAD) is a standardized dataset built for general audio tampering detection tasks. It is built entirely from real speech corpora: THCHS-30, AISHELL, LibriSpeech, Common Voice, CHiME, ASVspoof 2019 LA, and Crowdsourced high-quality Chilean Spanish speech. All source audio is in WAV format with a sampling rate of 16 kHz and a sample depth of 16 bits, and is processed to a duration of 4 seconds.

The dataset contains four basic types of tampering: deletion, copy-paste, same-source splicing, and different-source splicing. Untampered samples are obtained by cropping. Deletion samples remove a portion of the audio at a random starting point. Copy-paste samples copy a 0.1-0.2 second audio segment and paste it at a random position. Same-source and different-source splicing samples concatenate audio from the same dataset or from different datasets, respectively. Each type of sample comes with corresponding mask annotations.

The dataset is divided into three subsets by testing scenario: Dataset1, the Chinese-English universal tampering dataset; Dataset2, the multi-scenario extended dataset; and Dataset3, the plagiarism dataset. Dataset1 covers a normal recording studio environment in both Chinese and English. Dataset2 extends to real-environment, cross-language, and authenticity stitching scenarios: the real-environment scenario uses the CHiME dataset to simulate complex acoustic conditions, the cross-language scenario uses Spanish data, and the authenticity stitching scenario concatenates fake audio from ASVspoof 2019 LA with real audio. Dataset3 uses the PyRoomAcoustics library to build a simulated indoor acoustic environment for the plagiarism attack scenario, with room impulse responses simulated by the image source (mirror sound source) method, and focuses on three types of tampering: deletion, same-source splicing, and different-source splicing.

The dataset contains over 20,000 entries in Chinese and English, with over 5,000 entries of various tampering types, providing data support for audio tampering detection research.
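
The tampering operations described above can be illustrated with a short NumPy sketch. This is not the dataset authors' pipeline. The function names (`crop`, `delete`, `copy_paste`, `splice`) are hypothetical, and several details are assumptions that the description does not specify: a mask value of 1 marks a tampered sample, a deletion is marked at the single junction sample, a copy-paste overwrites the target region rather than inserting into it, and a splice marks the portion taken from the second clip. Only the 16 kHz rate, the 4-second length, and the 0.1-0.2 second copy length come from the description.

```python
import numpy as np

SR = 16_000         # sampling rate given in the description (16 kHz)
CLIP_LEN = 4 * SR   # 4-second clips, as given in the description


def crop(audio, rng):
    """Untampered sample: random 4 s crop with an all-zero mask."""
    start = int(rng.integers(0, len(audio) - CLIP_LEN + 1))
    clip = audio[start:start + CLIP_LEN].copy()
    return clip, np.zeros(CLIP_LEN, dtype=np.uint8)


def delete(audio, rng, del_len):
    """Deletion: drop del_len samples at a random point, output stays 4 s.

    Assumption: the mask marks only the junction sample.
    """
    start = int(rng.integers(0, len(audio) - (CLIP_LEN + del_len) + 1))
    seg = audio[start:start + CLIP_LEN + del_len]
    cut = int(rng.integers(1, CLIP_LEN))  # junction index in the output
    out = np.concatenate([seg[:cut], seg[cut + del_len:]])
    mask = np.zeros(CLIP_LEN, dtype=np.uint8)
    mask[cut] = 1
    return out, mask


def copy_paste(clip, rng):
    """Copy-paste: copy a 0.1-0.2 s segment and paste it at a random position.

    Assumption: the pasted segment overwrites the target region.
    """
    seg_len = int(rng.integers(SR // 10, SR // 5 + 1))
    src = int(rng.integers(0, CLIP_LEN - seg_len + 1))
    dst = int(rng.integers(0, CLIP_LEN - seg_len + 1))
    out = clip.copy()
    out[dst:dst + seg_len] = clip[src:src + seg_len]
    mask = np.zeros(CLIP_LEN, dtype=np.uint8)
    mask[dst:dst + seg_len] = 1
    return out, mask


def splice(clip_a, clip_b, rng):
    """Splicing: head of clip_a followed by the tail of clip_b.

    Same-source vs. different-source depends only on whether clip_b
    comes from the same corpus as clip_a. Both clips must be 4 s long.
    Assumption: the mask marks the part taken from clip_b.
    """
    cut = int(rng.integers(1, CLIP_LEN))
    out = np.concatenate([clip_a[:cut], clip_b[cut:CLIP_LEN]])
    mask = np.zeros(CLIP_LEN, dtype=np.uint8)
    mask[cut:] = 1
    return out, mask
```

Each function returns the tampered clip together with a per-sample mask of the same length, so a detector can be trained or evaluated on localization as well as on clip-level classification. The actual mask format shipped with MTAD may differ.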

Provider: Science Data Bank
Created: 2026-04-03