
Speech Recognition Test Dataset

Indexed by the National Basic Science Data Center (国家基础学科公共科学数据中心) on 2026-01-30
Download link:
https://nbsdc.cn/general/dataDetail?id=67d50cc9195d260905af951b&type=1
Resource description:
This dataset targets speech recognition research in lifelong-learning teaching scenarios. It was generated on the servers of the iFLYTEK Hear (讯飞听见) system from real data collected in real-world teaching sessions and in online live-streamed courses. It covers two conditions: quiet indoor environments (signal-to-noise ratio, SNR, above 25 dB) and noisy environments (SNR of 5-25 dB). The two test sets run 2.37 hours and 1.8 hours respectively, and the data totals 480 MB. The dataset supported the patent "A Speech Generation Method, Apparatus, Device and Storage Medium" (《一种语音生成方法、装置、设备及存储介质》) and the software copyright "Xiaofei Large Screen Scheduling Voice Assistant Software" (《小飞大屏调度语音助手软件》).

The dataset has three intended future uses. First, it can serve as a benchmark test set for evaluating how robust different speech recognition algorithms are in complex teaching scenarios, particularly under multi-device audio capture and overlapping teacher-student speech. Second, it can provide data for adaptive acoustic model training, supporting the development of personalized speech recognition systems for education. Third, it can support smart-education applications such as teaching quality analysis and the structuring of classroom content.

The value of the dataset lies in three areas. First, it helps fill the shortage of real teaching speech data for lifelong-learning scenarios; its noise interference and far-field capture characteristics matter for making speech technology practical in education. Second, by providing a standardized test set covering both quiet and noisy conditions, it gives algorithm comparisons a common benchmark, which helps bring speech recognition into use in teaching scenarios. Third, because the data comes from real teaching environments, its diversity (teacher accents, subject-specific terminology, classroom interaction, and so on) is valuable for research on improving the generalization of education-oriented speech recognition systems.
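The two conditions in the description are defined only by their SNR ranges. As a minimal sketch of how a recording could be assigned to one of them, the Python below computes SNR in decibels from signal and noise power and applies the stated thresholds. The function names, the power inputs, and the "out_of_range" label are assumptions made for illustration; the description does not say how the provider measured SNR or labeled the recordings.

```python
import math

# Thresholds taken from the dataset description:
# quiet scenes have SNR above 25 dB, noisy scenes have SNR of 5-25 dB.
QUIET_MIN_DB = 25.0
NOISY_MIN_DB = 5.0


def snr_db(signal_power: float, noise_power: float) -> float:
    """Return the signal-to-noise ratio in dB from two power values."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("signal_power and noise_power must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


def classify_scene(snr: float) -> str:
    """Map an SNR value in dB to a condition label (hypothetical helper)."""
    if snr > QUIET_MIN_DB:
        return "quiet"
    if snr >= NOISY_MIN_DB:
        return "noisy"
    # Below 5 dB is outside both ranges given in the description.
    return "out_of_range"


if __name__ == "__main__":
    # Illustrative power values, not taken from the dataset.
    for sig, noise in [(1000.0, 1.0), (100.0, 1.0), (2.0, 1.0)]:
        value = snr_db(sig, noise)
        print(f"{value:.1f} dB -> {classify_scene(value)}")
```

A recording at exactly 25 dB is treated as noisy here, because the description says the quiet condition is above 25 dB.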
Provider:
iFLYTEK Co., Ltd. (科大讯飞股份有限公司)
Collected Summary
Dataset Introduction
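Background and Challenges
Background Overview
This speech recognition test dataset was created by iFLYTEK Co., Ltd. (科大讯飞股份有限公司). It focuses on lifelong-learning teaching scenarios and contains real teaching speech recorded in quiet indoor and noisy environments, with a total duration of about 4.17 hours and a data size of 480 MB. It is mainly used to evaluate the robustness of speech recognition algorithms in teaching scenarios, and it supports adaptive acoustic model training and smart-education applications. Its value lies in filling the shortage of speech data for education and in establishing a standardized test benchmark.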
The content above was collected and summarized by 遇见数据集.
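Since the dataset is described as a benchmark for comparing recognition algorithms across quiet and noisy conditions, the sketch below shows one common way such a comparison could be scored: character error rate (CER), aggregated per condition. The listing does not state which metric the provider uses, so CER, the function names, and the sample tuples are all assumptions for illustration.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer_by_condition(samples):
    """Aggregate CER per condition.

    samples: iterable of (condition, reference, hypothesis) tuples.
    Returns a dict mapping condition to total edits / total reference length.
    """
    errors, lengths = {}, {}
    for condition, ref, hyp in samples:
        errors[condition] = errors.get(condition, 0) + edit_distance(ref, hyp)
        lengths[condition] = lengths.get(condition, 0) + len(ref)
    return {c: errors[c] / lengths[c] for c in errors if lengths[c] > 0}


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from the dataset.
    demo = [
        ("quiet", "abcd", "abcd"),
        ("noisy", "abcd", "abxd"),
        ("noisy", "abcd", "abc"),
    ]
    print(cer_by_condition(demo))
```

Summing edits and reference lengths before dividing weights each condition's score by the amount of speech, so long utterances are not underrepresented relative to short ones.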