lmarena-ai/search-arena-24k
Source: Hugging Face. Updated 2026-03-03; indexed 2025-05-31.
Download link:
https://hf-mirror.com/datasets/lmarena-ai/search-arena-24k
Description:
This dataset contains all in-the-wild conversations crowdsourced from Search Arena between March 18, 2025 and May 8, 2025. It consists of 24,069 multi-turn conversations with search-augmented LLMs across diverse intents, languages, and topics, along with 12,652 human preference votes. The dataset spans approximately 11,000 users across 136 countries, 13 publicly released models, around 90 languages (including 11% multilingual prompts), and over 5,000 multi-turn sessions. Each data point includes two standardized model responses, the human vote result (available for roughly half of the data points), a timestamp, full system metadata, the LLM + web search trace, and post-processed annotations such as language and user intent.
Provider:
lmarena-ai
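The description says votes are available for about half of the data points; the counts it gives (12,652 votes over 24,069 conversations) can be checked directly. The sketch below does that arithmetic and also shows one possible way to fetch the data. The helper names `vote_share` and `load_search_arena` are hypothetical, and the loader assumes the Hugging Face `datasets` package is installed and that the repository loads with its default configuration, which this page does not confirm.

```python
# Counts quoted in the dataset description above.
TOTAL_CONVERSATIONS = 24_069
PREFERENCE_VOTES = 12_652


def vote_share(total: int, votes: int) -> float:
    """Return the fraction of data points that carry a human vote."""
    if total <= 0:
        raise ValueError("total must be positive")
    return votes / total


def load_search_arena():
    """Hypothetical loader: assumes the `datasets` package is installed
    and that the repo loads with default settings (not confirmed here)."""
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset("lmarena-ai/search-arena-24k")


if __name__ == "__main__":
    share = vote_share(TOTAL_CONVERSATIONS, PREFERENCE_VOTES)
    print(f"Share of conversations with a vote: {share:.1%}")
```

The loader is kept separate and is never called at import time, so the arithmetic check runs without network access.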



