speechbrain/LoquaciousSet
Source: Hugging Face. Updated 2026-02-11. Indexed 2025-05-31.
Download link:
https://hf-mirror.com/datasets/speechbrain/LoquaciousSet
Summary:
LargeScaleASR is a dataset of 25,000 hours of transcribed English speech for automatic speech recognition, available for research and commercial use. It consists of six subsets: large, medium, small, clean, dev, and test. The large, medium, and small subsets contain 25,000, 2,500, and 250 hours, respectively, of read and spontaneous transcribed speech covering both clean and noisy conditions. The clean subset contains 13,000 hours of read and spontaneous transcribed speech and excludes the YODA and People's Speech data. The dev and test subsets contain 15 and 21 hours of data, respectively.
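The subset sizes above can be captured in a small lookup table, which makes it easy to choose a subset that fits a given data budget. The following is a minimal sketch rather than an official loader: the helper names `largest_subset_within` and `stream_subset` are hypothetical, treating small, medium, clean, and large as the training candidates is an assumption, and so is the idea that each subset name is exposed as a configuration of the `speechbrain/LoquaciousSet` repository (the dataset card should be checked before relying on it).

```python
from typing import Optional

# Subset sizes in hours, copied from the summary above.
SUBSET_HOURS = {
    "large": 25_000,
    "medium": 2_500,
    "small": 250,
    "clean": 13_000,
    "dev": 15,
    "test": 21,
}

# Assumption: these four subsets are the ones intended for training,
# while dev and test are held out for evaluation.
TRAINING_SUBSETS = ("small", "medium", "clean", "large")


def largest_subset_within(budget_hours: float) -> Optional[str]:
    """Return the biggest training subset that fits in budget_hours, or None."""
    fitting = [n for n in TRAINING_SUBSETS if SUBSET_HOURS[n] <= budget_hours]
    return max(fitting, key=SUBSET_HOURS.get) if fitting else None


def stream_subset(name: str):
    """Open one subset lazily with the Hugging Face `datasets` library.

    Assumption: `name` is a valid configuration of the repository.
    Streaming avoids downloading the full subset up front.
    """
    if name not in SUBSET_HOURS:
        raise ValueError(f"unknown subset: {name!r}")
    # Imported here so the size table is usable without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("speechbrain/LoquaciousSet", name, streaming=True)
```

For example, `largest_subset_within(3000)` selects `"medium"`, and the result could then be passed to `stream_subset` to start iterating over examples without a full download.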
Provider:
speechbrain



