
guilinhu/libri_conversation

Source: Hugging Face. Updated 2025-11-19; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/guilinhu/libri_conversation
Resource description:
---
license: mit
task_categories:
- audio-to-audio
language:
- en
---

# Libri Conversation FLAC Dataset

This dataset accompanies the paper:

**[Proactive Hearing Assistants that Isolate Egocentric Conversations](https://www.arxiv.org/abs/2511.11473)**
*Hu et al., 2025*

It contains **~234 hours of conversational-style audio** derived from LibriSpeech-like sources, processed into 60-second multispeaker segments under different experimental conditions:

- **libri_leaving**: scenarios where one speaker intermittently leaves the conversation
- **libri_multi**: 3-speaker conversational segments

All audio is stored in **FLAC** format. Metadata files (JSON) are preserved exactly as in the original directory structure.

---

## Dataset Structure

The dataset is organized into two main components:

    libri_leaving/
        train/
        val/
        test/
    libri_multi/
        train/
        val/
        test/

Because of Hugging Face API request limits, the dataset is packaged into `.tar` archives. Each archive mirrors the original folder structure:

    libri_leaving_train.tar
    libri_leaving_val.tar
    libri_leaving_test.tar
    libri_multi_train.tar
    libri_multi_val.tar
    libri_multi_test.tar

---

## Citation

If you use this dataset, please cite the following paper:

    @inproceedings{hu2025proactive,
      title={Proactive Hearing Assistants that Isolate Egocentric Conversations},
      author={Hu, Guilin and Itani, Malek and Chen, Tuochao and Gollakota, Shyamnath},
      booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
      pages={25377--25394},
      year={2025}
    }
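The archive layout described under Dataset Structure can be unpacked with a short script. The following is a minimal sketch, not an official loader. The helper names (`archive_names`, `download_archive`, `extract_archive`, `list_audio_and_metadata`) are hypothetical. It assumes that the six `.tar` files sit at the root of the dataset repository and that audio and metadata use the `.flac` and `.json` extensions; neither the exact placement of the archives nor the file naming inside them is stated in the card, so check the repository before relying on either.

```python
import tarfile
from pathlib import Path

# Component and split names as listed in the dataset card.
COMPONENTS = ("libri_leaving", "libri_multi")
SPLITS = ("train", "val", "test")


def archive_names():
    """Return the six archive file names, e.g. 'libri_leaving_train.tar'."""
    return [f"{c}_{s}.tar" for c in COMPONENTS for s in SPLITS]


def download_archive(name, local_dir="downloads"):
    """Fetch one archive from the Hub (requires `pip install huggingface_hub`).

    Assumption: the archive is stored at the repository root under `name`.
    """
    from huggingface_hub import hf_hub_download  # imported lazily, optional dependency

    return hf_hub_download(
        repo_id="guilinhu/libri_conversation",
        filename=name,
        repo_type="dataset",
        local_dir=local_dir,
    )


def extract_archive(tar_path, dest_dir):
    """Extract a .tar archive into dest_dir and return the destination path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path, "r") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: reject unsafe members such as absolute paths.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest


def list_audio_and_metadata(root):
    """Collect FLAC audio files and JSON metadata files found under root."""
    root = Path(root)
    return sorted(root.rglob("*.flac")), sorted(root.rglob("*.json"))
```

A typical use would be to call `download_archive("libri_multi_val.tar")`, pass the returned path to `extract_archive`, and then inspect the result with `list_audio_and_metadata`. Starting with a `val` or `test` archive keeps the first download smaller than a full training split.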

Provider:
guilinhu