
gedeonmate/LibriConvo-segmented

Source: Hugging Face. Updated 2025-10-30; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/gedeonmate/LibriConvo-segmented
Description:
LibriConvo-Segmented is a segmented version of the LibriConvo corpus, a simulated two-speaker conversational dataset built with Speaker-Aware Conversation Simulation (SASC). It is designed for training and evaluating multi-speaker speech processing systems, including speaker diarization, automatic speech recognition (ASR), and overlapping speech modeling. The segmented version provides conversational fragments of at most 30 seconds derived from full LibriConvo dialogues, with room impulse responses applied to 40% of them. The full LibriConvo corpus comprises 240.1 hours of audio across 1,496 dialogues with 830 unique speakers. The shorter, self-contained clips in this release are suited to fine-tuning ASR and diarization models.
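
To put the corpus figures in perspective, the following is a minimal arithmetic sketch based only on the numbers quoted in the description (240.1 hours, 1,496 dialogues, 30-second cap). The function names are hypothetical, and the segment-count bound assumes the segmented release covers the full corpus audio without dropping or duplicating any of it, which the listing does not confirm.

```python
# Figures quoted in the dataset description.
TOTAL_HOURS = 240.1
NUM_DIALOGUES = 1496
MAX_SEGMENT_SECONDS = 30


def mean_dialogue_minutes(total_hours: float, num_dialogues: int) -> float:
    """Average length of one full dialogue, in minutes."""
    return total_hours * 60 / num_dialogues


def min_segment_count(total_hours: float, max_segment_seconds: int) -> int:
    """Lower bound on the number of clips needed to cover the audio.

    Assumption (not stated in the listing): the segments cover all of
    the audio exactly once, so every clip is at most max_segment_seconds.
    """
    total_seconds = round(total_hours * 3600)  # whole seconds, avoids float drift
    return -(-total_seconds // max_segment_seconds)  # integer ceiling division


if __name__ == "__main__":
    print(f"Mean dialogue length: {mean_dialogue_minutes(TOTAL_HOURS, NUM_DIALOGUES):.2f} min")
    print(f"At least {min_segment_count(TOTAL_HOURS, MAX_SEGMENT_SECONDS)} segments")
```

Under these assumptions, a full dialogue averages a little under ten minutes, so each one is cut into many clips. The actual clip count is likely higher than the bound, since segments can be shorter than 30 seconds.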
Provider:
gedeonmate