speechbrain/LoquaciousSet
Source: Hugging Face. Updated: 2026-02-11. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/speechbrain/LoquaciousSet
Summary:
LargeScaleASR is a 25,000-hour English speech recognition dataset released for both research and commercial use. It consists of six subsets: large, medium, small, clean, dev, and test. The large, medium, and small subsets contain 25,000, 2,500, and 250 hours, respectively, of transcribed speech that mixes read and spontaneous speaking styles with clean and noisy recording conditions. The clean subset contains 13,000 hours of read and spontaneous transcribed speech and excludes the YODAS and People's Speech data. The dev and test subsets contain 15 hours and 21 hours of data, respectively.
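
As a rough illustration of how the subsets described above might be selected programmatically, the sketch below records the listed sizes and wraps the Hugging Face `datasets.load_dataset` call. The helper name `load_subset` and the `SUBSET_HOURS` table are hypothetical, and the sketch assumes each subset name is exposed as a configuration of `speechbrain/LoquaciousSet`; dev and test may instead be splits inside a configuration, so check the dataset card before relying on it.

```python
# Subset sizes in hours, copied from the summary above.
SUBSET_HOURS = {
    "large": 25000,
    "medium": 2500,
    "small": 250,
    "clean": 13000,
    "dev": 15,
    "test": 21,
}

REPO_ID = "speechbrain/LoquaciousSet"


def load_subset(name, streaming=True):
    """Load one subset from the Hugging Face Hub (hypothetical helper).

    Assumption: `name` is a valid configuration name for REPO_ID.
    Streaming avoids downloading thousands of hours of audio up front.
    """
    if name not in SUBSET_HOURS:
        raise ValueError(f"unknown subset {name!r}; expected one of {sorted(SUBSET_HOURS)}")
    # Imported lazily so the size table is usable without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, streaming=streaming)
```

Starting with `load_subset("small")` keeps a first experiment to the 250-hour subset before committing to the larger ones.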
Provider:
speechbrain