
OpenDialog

Source: ModelScope community. Updated 2026-05-15. Indexed 2025-07-19.
Download link:
https://modelscope.cn/datasets/k2-fsa/OpenDialog
Resource description:
# OpenDialog

OpenDialog is a 6.8k-hour spoken dialogue dataset, introduced in the paper [ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching](https://arxiv.org/abs/2507.09318).

- **Paper:** [https://arxiv.org/abs/2507.09318](https://arxiv.org/abs/2507.09318)
- **GitHub:** [https://github.com/k2-fsa/ZipVoice](https://github.com/k2-fsa/ZipVoice)
- **Project Page:** [https://zipvoice-dialog.github.io](https://zipvoice-dialog.github.io)

OpenDialog is the first large-scale (6.8k hours) open-source spoken dialogue dataset derived from in-the-wild speech data. It consists of:

- **English data:** 5074 hours
- **Chinese data:** 1759 hours

This dataset is also available at [ModelScope](https://www.modelscope.cn/datasets/k2-fsa/OpenDialog), which is more convenient for users in mainland China.

## Citation

```bibtex
@article{zhu2025zipvoicedialog,
  title={ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching},
  author={Zhu, Han and Kang, Wei and Guo, Liyong and Yao, Zengwei and Kuang, Fangjun and Zhuang, Weiji and Li, Zhaoqing and Han, Zhifeng and Zhang, Dong and Zhang, Xin and Song, Xingchen and Lin, Long and Povey, Daniel},
  journal={arXiv preprint arXiv:2507.09318},
  year={2025}
}
```

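OpenDialog is a spoken dialogue dataset with a total duration of 6.8k hours. It was introduced in the paper "ZipVoice-Dialog" (https://arxiv.org/abs/2507.09318); see the paper for details. The dataset contains 1759 hours of Chinese speech and 5074 hours of English speech.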
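
As a quick sanity check on the figures above, the sketch below sums the two per-language durations and computes each language's share of the total. The dictionary and the `summarize` helper are illustrative names, not part of the dataset or its tooling; the hour counts are copied from the description.

```python
# Hours per language, copied from the dataset description.
HOURS = {"English": 5074, "Chinese": 1759}


def summarize(hours):
    """Return the total duration and each language's fraction of it."""
    total = sum(hours.values())
    shares = {lang: h / total for lang, h in hours.items()}
    return total, shares


total, shares = summarize(HOURS)
print(f"total: {total} h (about {total / 1000:.1f}k h)")
for lang, share in shares.items():
    print(f"{lang}: {share:.1%}")
```

The total comes to 6833 hours, which rounds to the quoted 6.8k hours, with English making up roughly three quarters of the data.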
Provider: maas
Created: 2025-07-16