Hybrid Forgery Audio Dataset
Source: DataCite Commons. Updated: 2026-04-03. Indexed: 2026-05-05.
Download link:
https://www.scidb.cn/detail?dataSetId=549cc42afeba4e11b7860511e56536fc
Description:
The hybrid forgery dataset contains two types of samples: traditional editing tampering and partial deepfake.

The editing-tampering portion is built on the AISHELL-3 speech corpus and comprises speech samples from 174 different speakers. The samples were produced by manual cutting and splicing in the professional audio editing software CoolEdit Pro. The tampering forms are segment replacement and segment insertion. Every tampered segment is between 0.1 and 3 seconds long, covering scenarios that range from extremely short transient anomalies to longer, semantically inconsistent segments. The resulting audio samples have a total length of 3 to 8 seconds and consist of 4,341 training samples, 1,600 validation samples, and 3,248 test samples. All samples have been manually verified and boundary-annotated.

The deepfake samples are generated using speech synthesis technology based on Global Style Tokens (GST), simulating the forgery traces of algorithms such as speech synthesis and voice conversion. The length of both types of forged segments is strictly limited to the range of 0.1 to 3 seconds, and the segments are randomly spliced with real audio. The total audio length is likewise constrained to between 3 and 8 seconds. This dataset comprises 5,150 training samples, 2,000 validation samples, and 4,800 test samples, covering hybrid scenarios with multiple samples and multiple forgery types from the same set of 174 speakers.
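The insertion and replacement operations described above can be illustrated with a minimal Python sketch. It is not the authors' procedure (the editing-tampering samples were cut and spliced by hand in CoolEdit Pro); it only shows how a forged segment and its boundary annotation relate under the stated length limits. The function name `splice`, the list-of-floats waveform representation, and the 16 kHz sample rate in the usage lines are assumptions made for illustration.

```python
# Length limits taken from the dataset description (in seconds).
MIN_SEG_S, MAX_SEG_S = 0.1, 3.0      # forged segment
MIN_TOTAL_S, MAX_TOTAL_S = 3.0, 8.0  # whole tampered sample


def splice(real, forged, start, sample_rate, mode="insert"):
    """Hypothetical helper: splice `forged` into `real` at sample index `start`.

    mode="insert"  keeps all real samples and pushes the tail back.
    mode="replace" overwrites len(forged) real samples starting at `start`.
    Returns (tampered_waveform, (boundary_start_s, boundary_end_s)).
    """
    if mode not in ("insert", "replace"):
        raise ValueError("mode must be 'insert' or 'replace'")
    if not 0 <= start <= len(real):
        raise ValueError("start is outside the real waveform")

    seg_s = len(forged) / sample_rate
    if not MIN_SEG_S <= seg_s <= MAX_SEG_S:
        raise ValueError("forged segment length is outside 0.1-3 s")

    # Insertion resumes the real audio where it was cut;
    # replacement skips the real samples that the forged segment covers.
    tail_from = start if mode == "insert" else start + len(forged)
    tampered = real[:start] + forged + real[tail_from:]

    total_s = len(tampered) / sample_rate
    if not MIN_TOTAL_S <= total_s <= MAX_TOTAL_S:
        raise ValueError("total length is outside 3-8 s")

    boundary = (start / sample_rate, (start + len(forged)) / sample_rate)
    return tampered, boundary


# Usage with synthetic data; the 16 kHz rate is an assumption, not a dataset fact.
sr = 16000
real_audio = [0.0] * (4 * sr)      # 4 s of "real" audio
forged_audio = [1.0] * (sr // 2)   # 0.5 s forged segment
inserted, ins_bounds = splice(real_audio, forged_audio, 2 * sr, sr, "insert")
replaced, rep_bounds = splice(real_audio, forged_audio, 2 * sr, sr, "replace")
```

Insertion lengthens the sample by the forged segment's duration, while replacement leaves the total length unchanged whenever the forged segment fits inside the real audio. In both cases the boundary annotation is the same pair of timestamps.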
Provider:
Science Data Bank
Created:
2026-04-03



