Douban Conversation Corpus (豆瓣会话语料库)

Source: 超神经 (HyperAI). Updated 2023-12-25. Indexed 2024-05-15.

Download link:

https://hyper.ai/cn/datasets/28497

Resource description:

This dataset comprises a training set, a development set, and a test set for retrieval-based chatbots. The test set contains 1,000 dialogue contexts, and for each context the researchers created 10 candidate responses. Three human annotators were recruited to judge whether each candidate is an appropriate response to the conversation. A correct response is one that can naturally reply to the message given its context. Each context-response pair received three labels, and the majority label was taken as the final decision.
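The majority-vote step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the description does not specify how labels are encoded, so the binary 1 (appropriate) / 0 (inappropriate) encoding, the identifiers such as `ctx-0` and `resp-0`, and the function name `majority_label` are all hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Return the label given by most annotators.

    Assumes binary labels and an odd number of annotators,
    so that a tie cannot occur.
    """
    if len(labels) % 2 == 0:
        raise ValueError("an odd number of labels is needed to avoid ties")
    return Counter(labels).most_common(1)[0][0]


# Hypothetical annotations: three labels per (context, response) pair,
# with 1 = appropriate and 0 = inappropriate (an assumed encoding).
annotations = {
    ("ctx-0", "resp-0"): [1, 1, 0],
    ("ctx-0", "resp-1"): [0, 0, 1],
    ("ctx-0", "resp-2"): [1, 1, 1],
}

# Final decision for each pair is the majority of its three labels.
final = {pair: majority_label(votes) for pair, votes in annotations.items()}
```

With 1,000 contexts and 10 candidates each, the test set has 10,000 context-response pairs and therefore 30,000 individual annotator judgments, each pair reduced to a single final label in this way.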

Created:

2023-12-25

Collected summary

Background overview

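Douban Conversation Corpus is a dataset for retrieval-based chatbots, comprising a training set, a development set, and a test set. The test set provides 1,000 dialogue contexts, each paired with 10 candidate responses. Three annotators judged whether each response was appropriate, and the majority label was taken as the final decision.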

The content above was collected and summarized by 遇见数据集.