Charlotte25/WildChat-1M

Source: Hugging Face. Updated 2026-04-24; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Charlotte25/WildChat-1M
Description:
WildChat is a collection of 1 million conversations between human users and ChatGPT, accompanied by demographic data: state, country, hashed IP addresses, and request headers. We collected WildChat by offering online users free access to OpenAI's GPT-3.5 and GPT-4. In this version, 25.53% of the conversations come from the GPT-4 chatbot, while the rest come from the GPT-3.5 chatbot. The dataset contains a broad spectrum of user-chatbot interactions that were not previously covered by other instruction fine-tuning datasets, including ambiguous user requests, code-switching, topic-switching, and political discussions. WildChat can serve both as a dataset for instruction fine-tuning and as a valuable resource for studying user behavior. Note that this version of the dataset contains only non-toxic user inputs and ChatGPT responses.
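
The model split described above (about a quarter GPT-4, the rest GPT-3.5) could be checked with a small tally like the sketch below. It rests on assumptions not confirmed by this page: that each record carries a `model` field holding a versioned model name, that the repository exposes a `train` split, and that the repository id matches the page title. The helper names `model_share` and `load_wildchat` are hypothetical, and the sample records are made up for illustration.

```python
from collections import Counter


def model_share(records, field="model"):
    """Return the percentage of conversations per model family.

    Assumption: each record is a dict with a `field` key holding a
    versioned model name such as "gpt-4-0314".
    """
    counts = Counter()
    for record in records:
        name = str(record.get(field, "")).lower()
        # Group versioned names under their model family.
        if name.startswith("gpt-4"):
            counts["gpt-4"] += 1
        elif name.startswith("gpt-3.5"):
            counts["gpt-3.5"] += 1
        else:
            counts["other"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: round(100.0 * n / total, 2) for family, n in counts.items()}


def load_wildchat(repo_id="Charlotte25/WildChat-1M", split="train"):
    """Stream records from the Hugging Face Hub.

    Assumptions: the repo id and split name. Requires the `datasets`
    package and network access, so it is not called in this sketch.
    """
    from datasets import load_dataset  # lazy import keeps model_share usable offline
    return load_dataset(repo_id, split=split, streaming=True)


# Made-up sample records, for illustration only.
sample = [
    {"model": "gpt-4-0314"},
    {"model": "gpt-3.5-turbo-0301"},
    {"model": "gpt-3.5-turbo-0613"},
    {"model": "gpt-3.5-turbo-0301"},
]
shares = model_share(sample)
print(shares)
```

With network access, `model_share(load_wildchat())` would run the same tally over the streamed records, provided the field and split assumptions hold for this copy of the dataset.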
Provided by:
Charlotte25