CUDRT

Name: CUDRT
Creator: School of Information, Renmin University of China
Published: 2024-06-13 20:43:40
License: No description available

Source: arXiv. Updated: 2024-06-13. Indexed: 2024-06-21.

Download link:

https://github.com/TaoZhen1110/CUDRT Benchmark

Description:

The CUDRT dataset was created by the School of Information, Renmin University of China, together with other institutions, to evaluate the performance of AI-generated text detectors. It covers two languages, Chinese and English, and two text types, news articles and academic papers, for a total of approximately 40,000 instances. Several mainstream large language models were used to generate the texts, so as to simulate real-world text generation scenarios. Its main application areas are information security, copyright protection, and ethics, in particular the problem that AI-generated text is hard to distinguish from human-written text.

Provider:

School of Information, Renmin University of China

Created:

2024-06-13
