ByteDance-Seed/WideSearch

Source: Hugging Face. Updated 2025-09-08. Indexed 2025-08-09.
Download link:
https://hf-mirror.com/datasets/ByteDance-Seed/WideSearch
Description:
WideSearch is a benchmark for evaluating how well agents driven by large language models (LLMs) handle broad information-seeking tasks. It consists of 200 carefully designed tasks, half in English and half in Chinese, that require an agent to gather and organize a large amount of scattered information, with emphasis on the completeness and factual fidelity of the final result. The tasks are demanding not in cognitive difficulty but in operational scale, repetitiveness, and the quality required of the outcome. Each task has a corresponding ground-truth answer.
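
To make "completeness" and "factual fidelity" concrete, here is a minimal sketch that compares a table collected by an agent against a ground-truth table. It is an illustration only, not the benchmark's official scoring code: the function `row_level_scores`, the exact-match rule, the normalization, and the sample city/country rows are all assumptions made for this example. Recall stands in for completeness (how much of the ground truth was found) and precision for fidelity (how much of what was reported is correct).

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Row = Dict[str, str]


def _normalize(value: str) -> str:
    """Collapse whitespace and ignore case so trivial formatting differences are not errors."""
    return " ".join(str(value).split()).casefold()


def _canonical(rows: List[Row]) -> Set[FrozenSet[Tuple[str, str]]]:
    """Turn each row into a hashable set of (column, normalized value) pairs."""
    return {frozenset((col, _normalize(val)) for col, val in row.items()) for row in rows}


def row_level_scores(predicted: List[Row], truth: List[Row]) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) where a row counts only if every cell matches.

    Hypothetical metric for illustration; the real benchmark may score differently.
    """
    pred_rows = _canonical(predicted)
    true_rows = _canonical(truth)
    matched = len(pred_rows & true_rows)
    precision = matched / len(pred_rows) if pred_rows else 0.0
    recall = matched / len(true_rows) if true_rows else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Made-up sample data, not taken from the dataset.
truth = [
    {"city": "Paris", "country": "France"},
    {"city": "Rome", "country": "Italy"},
    {"city": "Oslo", "country": "Norway"},
]
predicted = [
    {"city": "paris", "country": "France "},  # matches after normalization
    {"city": "Rome", "country": "Spain"},     # one wrong cell, so the row does not match
]

precision, recall, f1 = row_level_scores(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=0.50 recall=0.33 f1=0.40
```

Requiring every cell of a row to match is deliberately strict: a single wrong value or a single missing row lowers the score, which mirrors the way these tasks stress scale and accuracy rather than reasoning difficulty.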
Provided by:
ByteDance-Seed