Chai User Response Dataset
Source: arXiv (indexed 2025-09-30)
Download link:
https://huggingface.co/hakurei/lit-6B
Resource overview:
This dataset contains 50 million partial conversations, each ending with a chatbot response. Each partial conversation is annotated with the conversation turn count, the number of subsequent user messages, whether the user requested a regeneration, and the user's star rating. Each full conversation appears multiple times in the dataset, once per chatbot response turn. The dataset is publicly released to facilitate further research on building engaging chatbots; its intended task is training a reward model that improves chatbot interactivity and response quality.
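The statement that each full conversation appears once per chatbot turn can be illustrated with a minimal sketch. All names here (`PartialConversation`, `expand`, the field names, and the "user"/"bot" speaker labels) are hypothetical and are not the dataset's actual schema. The regeneration flag and star rating are left out because they cannot be derived from a transcript alone, and "turn" is assumed to mean the 1-based index of the chatbot response.

```python
from dataclasses import dataclass
from typing import List, Tuple

Message = Tuple[str, str]  # (speaker, text); speaker labels are assumed, not from the dataset


@dataclass
class PartialConversation:
    """Hypothetical record layout for one partial conversation."""
    messages: List[Message]           # prefix of the conversation, ending with a bot message
    turn: int                         # assumed: 1-based index of this chatbot response
    num_following_user_messages: int  # user messages sent after this response


def expand(conversation: List[Message]) -> List[PartialConversation]:
    """Emit one partial conversation per chatbot response in a full conversation."""
    samples = []
    bot_turn = 0
    for i, (speaker, _) in enumerate(conversation):
        if speaker != "bot":
            continue
        bot_turn += 1
        # Count how many user messages follow this chatbot response.
        following = sum(1 for s, _ in conversation[i + 1:] if s == "user")
        samples.append(PartialConversation(conversation[: i + 1], bot_turn, following))
    return samples


conversation = [
    ("user", "hi"),
    ("bot", "hello"),
    ("user", "how are you"),
    ("bot", "fine"),
    ("user", "bye"),
]
samples = expand(conversation)
```

Under these assumptions, a conversation with two chatbot responses yields two records, and the count of following user messages could serve as one engagement signal for a reward model.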
Provider:
Chai Research



