OpenDialog
ModelScope community. Updated 2026-05-15; indexed 2025-07-19.
Download link:
https://modelscope.cn/datasets/k2-fsa/OpenDialog
Resource description:
# OpenDialog
OpenDialog is a 6.8k-hour spoken dialogue dataset, introduced in the paper [ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching](https://arxiv.org/abs/2507.09318).
- **Paper:** [https://arxiv.org/abs/2507.09318](https://arxiv.org/abs/2507.09318)
- **GitHub:** [https://github.com/k2-fsa/ZipVoice](https://github.com/k2-fsa/ZipVoice)
- **Project Page:** [https://zipvoice-dialog.github.io](https://zipvoice-dialog.github.io)
OpenDialog is the first large-scale (6.8k hours) open-source spoken dialogue dataset derived from in-the-wild speech data. It consists of:
- **English data:** 5074 hours
- **Chinese data:** 1759 hours
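
As a quick sanity check on the figures above, the following minimal sketch sums the per-language hours and reports each language's share of the total. The variable names are illustrative only and are not part of any OpenDialog tooling.

```python
# Hours per language, copied from the list above.
hours = {"English": 5074, "Chinese": 1759}

# Total duration and each language's fraction of it.
total = sum(hours.values())
shares = {lang: h / total for lang, h in hours.items()}

print(f"Total: {total} hours (about {total / 1000:.1f}k)")
for lang, share in shares.items():
    print(f"{lang}: {share:.1%}")
```

The two subsets add up to 6833 hours, which matches the rounded 6.8k-hour figure, with English making up roughly three quarters of the data.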
This dataset is also available at [ModelScope](https://www.modelscope.cn/datasets/k2-fsa/OpenDialog), which is more convenient for users in mainland China.
## Citation
```bibtex
@article{zhu2025zipvoicedialog,
title={ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching},
author={Zhu, Han and Kang, Wei and Guo, Liyong and Yao, Zengwei and Kuang, Fangjun and Zhuang, Weiji and Li, Zhaoqing and Han, Zhifeng and Zhang, Dong and Zhang, Xin and Song, Xingchen and Lin, Long and Povey, Daniel},
journal={arXiv preprint arXiv:2507.09318},
year={2025}
}
```
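# OpenDialog
OpenDialog is a spoken dialogue dataset with a total duration of 6.8k hours. It was introduced in the paper "ZipVoice-Dialog" (https://arxiv.org/abs/2507.09318); see the paper for details.
The dataset contains 1759 hours of Chinese data and 5074 hours of English data.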
Provider:
maas
Created:
2025-07-16



