ASR-RAMC-BIGCCSC: A CHINESE CONVERSATIONAL SPEECH CORPUS

OpenDataLab | Updated 2026-05-24 | Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/ASR-RAMC-BIGCCSC-_A_CHINESE_CONVERSATIONAL_etc
Resource description:
This dataset comprises 351 multi-turn Mandarin conversations. The annotations for each conversation include the transcript, voice activity timestamps, speaker information, recording information, and topic information. Speaker information covers gender, age, and region; recording information covers the environment and the device.

All data was collected indoors. The acoustic environment is a room of less than 20 square meters with a reverberation time (RT60) below 0.4 seconds. The ambient noise level is below 40 dB(A), and the environment stayed relatively quiet during recording. All participants are native Chinese speakers who speak fluent Mandarin, with slightly varying accents.

The audio was recorded with a mobile application developed by Magic Data. All recording devices are mainstream smartphones, with Android and iOS used in a ratio of roughly 1:1. The audio files use 16-bit samples at a sampling rate of 16 kHz, and the recording quality is higher than that of comparable conversational speech corpora such as HKUST/MTS, SwitchBoard, and Fisher.

The transcripts were manually annotated by Magic Data and proofread by professional reviewers. The MagicData-RAMC annotations are very rich: beyond the transcription of the speech content, they also mark non-verbal information such as laughter, music, and noise. Disfluencies common in spoken conversation, such as hesitations and repetitions, are annotated as well. The start timestamp of each speaker's speech is also annotated, which can be used for research on speaker diarization.
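The stated audio format (16-bit samples, 16 kHz) can be checked on a downloaded file before training or evaluation. The following is a minimal sketch using only the Python standard library. It assumes the audio is delivered as WAV files, which the description does not state; `check_wav_format` and `make_silent_wav` are hypothetical helper names introduced here for illustration, not part of any Magic Data or OpenDataLab tooling.

```python
import io
import struct
import wave

# Format stated in the dataset description: 16-bit samples at 16 kHz.
EXPECTED_SAMPLE_WIDTH_BYTES = 2
EXPECTED_SAMPLE_RATE_HZ = 16000


def check_wav_format(source):
    """Return (matches, details) for a WAV file path or binary file object."""
    with wave.open(source, "rb") as wav:
        details = {
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    matches = (
        details["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH_BYTES
        and details["sample_rate_hz"] == EXPECTED_SAMPLE_RATE_HZ
    )
    return matches, details


def make_silent_wav(sample_rate_hz, seconds=1):
    """Build an in-memory mono 16-bit WAV of silence (demo input only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(struct.pack("<h", 0) * (sample_rate_hz * seconds))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Synthetic stand-ins; replace with the path of a real corpus file.
    print(check_wav_format(make_silent_wav(16000)))
    print(check_wav_format(make_silent_wav(8000)))
```

For a real file, pass its path instead of the synthetic buffer, for example `check_wav_format("path/to/recording.wav")`, where the path is a placeholder. The channel count is reported but not checked, because the description does not specify it.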
Provider:
OpenDataLab
Created:
2023-01-12
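Collected summary
Dataset introduction
Background and challenges
Background overview
ASR-RAMC-BIGCCSC is a Chinese conversational speech corpus containing 351 multi-turn Mandarin conversations. Its annotations are rich and include transcripts, voice activity timestamps, and speaker information. The data was collected indoors with high recording quality, and it is suited to research on speech recognition and speaker diarization.
The above content was collected and summarized by 遇见数据集.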