CHiME-6

Name: CHiME-6
Creator: OpenDataLab
Published: 2026-05-17 11:30:44
License: No description available

OpenDataLab, updated 2026-05-17, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/CHiME-6

Resource overview:

Following the success of the 1st, 2nd, 3rd, 4th, and 5th CHiME challenges, we organized the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge revisits the previous CHiME-5 challenge and further considers the problem of distant multi-microphone conversational speech diarization and recognition in everyday home environments. The speech material is the same as the previous CHiME-5 recordings, except that the arrays have been accurately synchronized. The material was elicited using a dinner party scenario, with efforts made to capture data that is representative of natural conversational speech. This paper provides a baseline description of the CHiME-6 challenge for both segmented multi-speaker speech recognition (Track 1) and unsegmented multi-speaker speech recognition (Track 2). Notably, Track 2 is the first challenge activity in the community to tackle an unsegmented multi-speaker speech recognition scenario with a complete set of reproducible open-source baselines, providing speech enhancement, speaker diarization, and speech recognition modules.

Provided by:

OpenDataLab

Created:

2023-06-25

Curated summary

Dataset introduction

Background and challenges

Background overview

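CHiME-6 is the dataset of the 6th CHiME challenge. It focuses on distant multi-microphone conversational speech processing in everyday home environments, in particular multi-speaker speech recognition in both segmented and unsegmented scenarios. It is based on the CHiME-5 recordings but with improved array synchronization, aims to capture data representative of natural conversational speech, and provides reproducible open-source baselines. The dataset was released in 2018 by Johns Hopkins University and other institutions, and is suited to research on speech enhancement, speaker diarization, and speech recognition.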

The above content was collected and summarized by 遇见数据集.
