Twitter Dialogue Dataset
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/marsan-ma/chat_corpus
Description:
This dataset contains approximately 2.5 million single-turn conversations collected from Twitter. It is used to train both generative and retrieval-based dialogue models, whose responses are then evaluated for fairness with respect to gender. This makes it a key resource for research on fairness issues in dialogue systems.
Provider:
Twitter



