guilinhu/libri_conversation
Source: Hugging Face. Updated 2025-11-19; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/guilinhu/libri_conversation
Resource description:
---
license: mit
task_categories:
- audio-to-audio
language:
- en
---
# Libri Conversation FLAC Dataset
This dataset accompanies the paper:
**[Proactive Hearing Assistants that Isolate Egocentric Conversations](https://www.arxiv.org/abs/2511.11473)**
*Hu et al., 2025*
It contains **~234 hours of conversational-style audio** derived from LibriSpeech-like sources, processed into 60-second multispeaker segments under different experimental conditions:
- **libri_leaving** — scenarios where one speaker intermittently leaves the conversation
- **libri_multi** — 3-speaker conversational segments
All audio is stored in **FLAC** format. Metadata files (JSON) are preserved exactly as in the original directory structure.
---
## Dataset Structure
The dataset is organized into two main components:

    libri_leaving/
        train/
        val/
        test/
    libri_multi/
        train/
        val/
        test/

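
Given a local copy laid out as in the tree above, a quick inventory of one split can be taken with the standard library alone. This is a minimal sketch: the helper names `summarize_split` and `list_audio` are hypothetical, and the layout inside each split folder is not documented here, so the code simply searches recursively by file extension.

```python
from collections import Counter
from pathlib import Path


def summarize_split(split_dir):
    """Count files under split_dir by lowercase extension (recursive)."""
    return Counter(
        p.suffix.lower() for p in Path(split_dir).rglob("*") if p.is_file()
    )


def list_audio(split_dir):
    """Return all FLAC files under split_dir, sorted by path."""
    return sorted(Path(split_dir).rglob("*.flac"))


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the data was unpacked.
    split = Path("libri_multi") / "val"
    if split.is_dir():
        print(summarize_split(split))
        print(len(list_audio(split)), "FLAC files")
```

If the third-party `soundfile` package is installed, `soundfile.read(path)` returns the samples and sample rate for any path produced by `list_audio`.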
Because of Hugging Face API request limits, the dataset is packaged into `.tar` archives.
Each archive mirrors the original folder structure:

    libri_leaving_train.tar
    libri_leaving_val.tar
    libri_leaving_test.tar
    libri_multi_train.tar
    libri_multi_val.tar
    libri_multi_test.tar

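
One way to fetch and unpack a single archive is sketched below, using `hf_hub_download` from the `huggingface_hub` package together with the standard `tarfile` module. Several things here are assumptions rather than facts from the dataset card: that the archives sit at the root of the `guilinhu/libri_conversation` dataset repository, and the helper names `archive_name`, `extract_archive`, and `download_and_extract`, which are hypothetical.

```python
import tarfile
from pathlib import Path

REPO_ID = "guilinhu/libri_conversation"
SUBSETS = ("libri_leaving", "libri_multi")
SPLITS = ("train", "val", "test")


def archive_name(subset: str, split: str) -> str:
    """Build the archive file name for one subset/split pair."""
    if subset not in SUBSETS or split not in SPLITS:
        raise ValueError(f"unknown subset/split: {subset}/{split}")
    return f"{subset}_{split}.tar"


def extract_archive(tar_path, out_dir) -> Path:
    """Unpack one .tar archive into out_dir and return out_dir as a Path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(tar_path) as tar:
        tar.extractall(out_dir, **kwargs)
    return out_dir


def download_and_extract(subset: str, split: str, out_dir="data") -> Path:
    """Fetch one archive from the Hub and unpack it (needs network access)."""
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub

    tar_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=archive_name(subset, split),  # assumed to be at the repo root
        repo_type="dataset",
    )
    return extract_archive(tar_path, out_dir)
```

Calling `download_and_extract("libri_multi", "val")` would then place the unpacked files under `data/`. Downloading one archive at a time keeps the number of Hub requests small, which matches the stated reason for packaging the data as `.tar` files.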
---
## Citation
If you use this dataset, please cite the following paper:

    @inproceedings{hu2025proactive,
      title={Proactive Hearing Assistants that Isolate Egocentric Conversations},
      author={Hu, Guilin and Itani, Malek and Chen, Tuochao and Gollakota, Shyamnath},
      booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
      pages={25377--25394},
      year={2025}
    }

Provider:
guilinhu



