WaveFake
OpenDataLab | Updated: 2026-05-17 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/WaveFake
Description:
This dataset consists of 104,885 generated audio clips in 16-bit PCM WAV format. We examine multiple networks trained on two reference datasets. The first is LJSpeech, which contains 13,100 short audio clips (about 6 seconds each on average, roughly 24 hours in total) read by a single female speaker. It features passages from seven non-fiction books, recorded with a MacBook Pro microphone. The second is the JSUT dataset, specifically its basic5000 corpus, which consists of 5,000 sentences covering all basic kanji of the Japanese language (4.8 seconds on average, roughly 6.7 hours in total). Those recordings were made by a female native Japanese speaker in an anechoic room. Finally, we include samples from a complete text-to-speech pipeline (16,283 phrases, 3.8 seconds on average, roughly 17.5 hours in total). In total, the dataset contains approximately 175 hours of generated audio. Note that we do not redistribute the reference data.
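The format and duration figures in the description can be checked against a local copy of the data. The following is a minimal sketch using only the Python standard library. It assumes the archive has been extracted into a directory of `.wav` files. The function name `summarize_wavs` and the directory layout are illustrative assumptions, not part of the dataset's documentation.

```python
import wave
from pathlib import Path


def summarize_wavs(root):
    """Return (clip_count, total_hours, non_16bit_paths) for all .wav files under root.

    `root` is a hypothetical path to an extracted copy of the dataset.
    """
    clip_count = 0
    total_seconds = 0.0
    non_16bit = []
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # getsampwidth() returns bytes per sample; 2 bytes means 16-bit PCM.
            if wav.getsampwidth() != 2:
                non_16bit.append(path)
            # Duration in seconds = number of frames / frames per second.
            total_seconds += wav.getnframes() / wav.getframerate()
        clip_count += 1
    return clip_count, total_seconds / 3600.0, non_16bit
```

Calling `summarize_wavs("WaveFake")` on a hypothetical extraction directory would return the clip count, the total duration in hours, and a list of any files whose sample width is not 16 bits, which can then be compared with the figures quoted above.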
Provider:
OpenDataLab
Created:
2022-09-01
Collected summary
Dataset introduction

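Background and challenges
Background overview
WaveFake is a dataset for audio deepfake detection. It contains about 105,000 generated audio clips totaling roughly 175 hours, produced by networks trained on recordings of one English-speaking and one Japanese-speaking female speaker. The dataset supports research on audio pre-training and deepfake detection. It was released by Ruhr University Bochum in 2021 under the CC BY-SA 4.0 license.
The above content was collected and summarized by 遇见数据集.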



