
ylacombe/mls-annotated

Source: Hugging Face. Updated 2024-11-05; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/ylacombe/mls-annotated
Overview:
This dataset contains audio and text data in multiple languages, covering Dutch, French, German, Italian, Polish, Portuguese, and Spanish. Each language has its own configuration, with features such as audio path, start time, end time, text, audio duration, speaker ID, chapter ID, file, ID, utterance pitch mean, utterance pitch standard deviation, SNR, C50, speaking rate, phonemes, STOI, SI-SDR, PESQ, original text, gender, pitch, noise, reverberation, speech monotony, noise SDR, speech quality PESQ, text description, and non-capitalized text. The data is divided into splits such as dev, test, train, 9 hours, and 1 hour, and the byte size and number of examples are reported for each split. The download size and dataset size are also given for each configuration.
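As a rough illustration of how the configurations, splits, and quality features described above might be used, the sketch below loads one language split and keeps only utterances above a signal-to-noise threshold. The configuration name `"dutch"` and the column names `"snr"` and `"audio_duration"` are assumptions inferred from the feature list, not confirmed identifiers; check the dataset card before relying on them. The helper names `load_split` and `filter_clean` are hypothetical.

```python
from typing import Any, Iterable, Iterator, Mapping, Optional


def load_split(config: str = "dutch", split: str = "dev"):
    """Load one language configuration from the Hugging Face Hub.

    Needs the `datasets` package and network access. The config name
    "dutch" is an assumption; "dev" is one of the splits listed above.
    """
    # Imported lazily so filter_clean() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("ylacombe/mls-annotated", config, split=split)


def filter_clean(
    rows: Iterable[Mapping[str, Any]],
    min_snr: float = 20.0,
    max_duration: Optional[float] = None,
) -> Iterator[Mapping[str, Any]]:
    """Yield rows whose SNR is at least `min_snr`.

    Column names "snr" and "audio_duration" are assumed, not confirmed.
    Rows with a missing SNR are skipped. If `max_duration` is set, rows
    longer than that many seconds are skipped as well.
    """
    for row in rows:
        snr = row.get("snr")
        if snr is None or snr < min_snr:
            continue
        if max_duration is not None:
            duration = row.get("audio_duration")
            if duration is None or duration > max_duration:
                continue
        yield row


# Small in-memory stand-in for dataset rows (values are made up).
sample_rows = [
    {"id": "a", "snr": 35.2, "audio_duration": 12.0},
    {"id": "b", "snr": 8.1, "audio_duration": 15.0},
    {"id": "c", "snr": None, "audio_duration": 10.0},
    {"id": "d", "snr": 41.0, "audio_duration": 19.5},
]

kept_ids = [r["id"] for r in filter_clean(sample_rows, max_duration=15.0)]
print(kept_ids)  # ['a']
```

With the real data, the same filter could be applied as `filter_clean(load_split("dutch", "dev"))`, since a loaded split iterates as dictionaries keyed by column name.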
Provided by:
ylacombe