ggfox00000/dia-Notsofar-test
Source: Hugging Face. Updated 2026-04-28; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/ggfox00000/dia-Notsofar-test
Resource overview:
NOTSOFAR-1 is a benchmark for joint speaker diarization (DIA) and speech-to-text (STT) in far-field, multi-speaker meeting transcription. The dataset contains recordings of natural meetings with multiple speakers, captured with both far-field microphone arrays and close-talk microphones. It provides three evaluation splits: a small evaluation set without ground truth (GT), a small evaluation set with GT, and a full evaluation set with GT, totaling 7,768 files and approximately 84.5 GB. The directory layout of the upstream repository is preserved; each meeting recording contains device information, close-talk microphone recordings, far-field microphone array recordings, and, where GT is available, transcriptions. Reference metrics are DER (diarization error rate), tcpWER (time-constrained minimum-permutation word error rate), and SA-WER (speaker-attributed word error rate).
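As a rough illustration of the first reference metric, DER is conventionally defined as the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below only shows that arithmetic. The function name and the example durations are hypothetical and are not taken from NOTSOFAR-1 results, and the benchmark's official scoring may apply further conventions (for example collars or overlap handling) that are not modeled here.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_reference):
    """Return DER as a fraction, given error durations in seconds.

    Hypothetical helper for illustration only:
    DER = (missed + false alarm + speaker confusion) / total reference speech.
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return (missed + false_alarm + confusion) / total_reference


# Made-up durations (seconds) for a single meeting, not real benchmark numbers.
der = diarization_error_rate(
    missed=6.0,
    false_alarm=3.0,
    confusion=3.0,
    total_reference=120.0,
)
print(f"DER = {der:.1%}")
```

Because false-alarm time is counted in the numerator but not in the denominator, DER can exceed 100% for a system that heavily over-segments.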
Provided by:
ggfox00000



