HybridDialogue

Name: HybridDialogue
Creator: maas
Published: 2026-01-06 16:50:40
License: 暂无描述

魔搭社区2026-01-06 更新2025-12-06 收录

下载链接：

https://modelscope.cn/datasets/cerebras/HybridDialogue

下载链接

链接失效反馈

官方服务：

资源简介：

# Dataset Information A pre-processed version of the HybridDialogue dataset. The dataset was created as part of our work on Cerebras DocChat - a document-based conversational Q&A model. This dataset is intended to be used for training purposes, and so overlapping samples with the HybridDialogue test set in [ChatRAG](https://huggingface.co/datasets/nvidia/ChatRAG-Bench) have been removed. Each sample in this dataset contains a `messages` multi-turn conversation, a `document` which is a concatenated representation of relevant document(s), and `answers` for the current turn. # Acknowledgement This dataset is a processed version of the HybridDialogue dataset. ``` @inproceedings{nakamura2022hybridialogue, title={HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data}, author={Nakamura, Kai and Levy, Sharon and Tuan, Yi-Lin and Chen, Wenhu and Wang, William Yang}, booktitle={Findings of the Association for Computational Linguistics: ACL 2022}, year={2022} } ```

# 数据集信息本数据集为HybridDialogue数据集的预处理版本，是我们针对Cerebras DocChat（一款基于文档的会话式问答模型）开展的研究工作的组成部分。本数据集仅用于训练场景，因此已移除与[ChatRAG](https://huggingface.co/datasets/nvidia/ChatRAG-Bench)中的HybridDialogue测试集存在样本重叠的条目。本数据集的每条样本均包含`messages`字段（多轮会话内容）、`document`字段（相关文档的拼接表示）以及`answers`字段（当前轮次的回答）。 # 致谢本数据集为HybridDialogue数据集的处理后版本。 @inproceedings{nakamura2022hybridialogue, title={HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data}, author={Nakamura, Kai and Levy, Sharon and Tuan, Yi-Lin and Chen, Wenhu and Wang, William Yang}, booktitle={Findings of the Association for Computational Linguistics: ACL 2022}, year={2022} }

提供机构：

maas

创建时间：

2025-10-23

5,000+

优质数据集

54 个

任务类型

进入经典数据集