ByteDance-BandAI/ReportBench
Source: Hugging Face. Updated: 2025-09-04. Indexed: 2025-11-01.
Download link:
https://hf-mirror.com/datasets/ByteDance-BandAI/ReportBench
Overview:
ReportBench is a comprehensive benchmark for evaluating the factual quality and citation behavior of Deep Research agents. Its features include prompts, ground truth (with author, bib_id, and meta_info), arxiv_id, title, abstract, and more. The dataset also includes evaluation results for different Deep Research products and their base models. The data is stored as JSON Lines, with each object describing one task and its ground-truth references. The dataset is licensed under CC0-1.0 and is intended for text generation tasks. The language is English, and the dataset falls in the n<1K size category.
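As a rough illustration of the JSON Lines layout described above, the following Python sketch reads one object per line and collects the reference identifiers for each task. The record shown is a made-up placeholder, and the exact schema is an assumption: the key names prompt, arxiv_id, title, abstract, author, bib_id, and meta_info come from the feature list above, but the key name ground_truth and the nesting of references under it as a list are guesses, not confirmed by the dataset card.

```python
import io
import json


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical record: every value is a placeholder, and the nesting of
# references under "ground_truth" is an assumption about the schema.
sample_record = {
    "arxiv_id": "0000.00000",
    "title": "Example survey",
    "abstract": "Placeholder abstract.",
    "prompt": "Placeholder task prompt.",
    "ground_truth": [
        {"author": "Doe, J.", "bib_id": "doe2020", "meta_info": "placeholder"},
    ],
}

# An in-memory stream stands in for the downloaded .jsonl file.
sample_stream = io.StringIO(json.dumps(sample_record) + "\n")

tasks = list(read_jsonl(sample_stream))

# Collect the reference identifiers attached to each task.
bib_ids = [ref["bib_id"] for task in tasks for ref in task["ground_truth"]]
```

To run this against the real data, replace the in-memory stream with an opened file handle for the downloaded file and adjust the key names to whatever the actual records contain.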
Provided by:
ByteDance-BandAI



