RoFT

Name: RoFT
Creator: University of Pennsylvania
License: No description available

Indexed from arXiv: 2025-09-30

Download link:

https://github.com/liamdugan/human-detection

Description:

This dataset comprises over 21,000 human annotations paired with error classifications, collected to study how well humans can identify the point at which a text transitions from human-written to machine-generated. It spans several genres, including news articles, presidential speeches, fictional stories from Reddit, and recipes from Recipe1M+, and is intended to support future research on human detection and evaluation of generated text.

Provided by:

University of Pennsylvania
