A Dataset for Voice-Based Human Identity Recognition

Source: NIAID Data Ecosystem. Indexed 2026-03-13.

Download link:

https://data.mendeley.com/datasets/zw4p4p7sdh

Description:

This dataset is divided into two main sub-datasets: samePhrase and differentPhrase. Each speaker has the same label in both sub-datasets. In the samePhrase sub-dataset, each speaker repeats the sentence “Machine Learning 1, 2, 3, 4, 5, 6, 7, 8, 9, 10” ten times, and each sample is between seven and ten seconds long. In the differentPhrase sub-dataset, each speaker contributed ten different samples, each a phrase selected randomly from sources such as books, song lyrics, or one-line texts. No sample in the differentPhrase sub-dataset exceeds ten seconds.
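
Because each speaker carries the same label in both sub-datasets, recordings can be grouped per speaker across samePhrase and differentPhrase. The following is a minimal Python sketch of that grouping, not an official loader. It assumes a hypothetical layout of `<root>/<sub-dataset>/<speaker_label>/<name>.wav`; the listing does not describe the actual archive structure or file format, so the glob pattern may need adjusting. The function names `index_by_speaker` and `incomplete_speakers` are illustrative only.

```python
from collections import defaultdict
from pathlib import Path

# Sub-dataset names as given in the description.
SUBSETS = ("samePhrase", "differentPhrase")


def index_by_speaker(root):
    """Map speaker label -> sub-dataset name -> sorted list of audio paths.

    ASSUMPTION: files are stored as <root>/<subset>/<speaker_label>/<name>.wav.
    The real archive layout and file extension may differ.
    """
    index = defaultdict(lambda: {name: [] for name in SUBSETS})
    for subset in SUBSETS:
        for wav in sorted(Path(root, subset).glob("*/*.wav")):
            # The parent directory name is taken as the speaker label.
            index[wav.parent.name][subset].append(wav)
    return dict(index)


def incomplete_speakers(index, expected=10):
    """Return labels of speakers without `expected` samples in every sub-dataset."""
    return sorted(
        label
        for label, subsets in index.items()
        if any(len(subsets[name]) != expected for name in SUBSETS)
    )
```

Calling `incomplete_speakers(index_by_speaker("path/to/dataset"))` gives a quick check that every speaker has the ten samples per sub-dataset that the description states.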

Created:

2022-03-31
