IntelLabs/AI-Peer-Review-Detection-Benchmark

Name: IntelLabs/AI-Peer-Review-Detection-Benchmark
Creator: IntelLabs
Published: 2025-05-27 15:44:37
License: No description available

Source: Hugging Face. Updated: 2025-05-27. Indexed: 2025-05-31.

Download link:

https://hf-mirror.com/datasets/IntelLabs/AI-Peer-Review-Detection-Benchmark

Resource overview:

The AI Peer Review Detection Benchmark is the largest dataset to date of paired human- and AI-written peer reviews of the same research papers. It comprises 788,984 reviews covering 8 years of submissions to two leading AI research conferences, ICLR and NeurIPS. Each AI-generated review was produced by one of five widely used large language models (LLMs): GPT-4o, Claude Sonnet 3.5, Gemini 1.5 Pro, Qwen 2.5 72B, or Llama 3.1 70B, and is paired with the corresponding human-written reviews. The dataset is split into several subsets (calibration, test, extended) to support systematic evaluation of AI-generated text detection methods.
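
The overview does not say how the calibration and test subsets are meant to be used. One common pattern, assumed here, is to pick a detector threshold on human-written calibration reviews at a fixed false-positive rate and then measure how many AI-written test reviews are caught. The sketch below illustrates that idea only; the function names are hypothetical and the scores are toy values, not taken from the dataset.

```python
from typing import Sequence


def threshold_at_fpr(human_scores: Sequence[float], target_fpr: float) -> float:
    """Pick a threshold so that at most `target_fpr` of human scores exceed it.

    Scores are assumed to be "higher means more likely AI-written".
    """
    ranked = sorted(human_scores, reverse=True)
    # Number of human-written reviews we are allowed to misflag.
    allowed = int(target_fpr * len(ranked))
    if allowed >= len(ranked):
        return float("-inf")  # every review may be flagged
    # Flagging scores strictly above this value misflags at most `allowed` items.
    return ranked[allowed]


def true_positive_rate(ai_scores: Sequence[float], threshold: float) -> float:
    """Fraction of AI-written reviews whose score is strictly above the threshold."""
    return sum(score > threshold for score in ai_scores) / len(ai_scores)


# Toy detector scores (assumed values for illustration only).
calibration_human = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
test_ai = [0.97, 0.92, 0.85, 0.99]

threshold = threshold_at_fpr(calibration_human, target_fpr=0.1)
tpr = true_positive_rate(test_ai, threshold)
print(f"threshold={threshold}, TPR on AI test reviews={tpr}")
```

Keeping the threshold selection on calibration data separate from the test measurement avoids tuning the detector on the same reviews used to report its accuracy.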

Provided by:

IntelLabs
