JonathanZha/PADBen
Source: Hugging Face. Updated: 2025-11-02. Indexed: 2025-11-15.
Download link:
https://hf-mirror.com/datasets/JonathanZha/PADBen
Description:
PADBen is a comprehensive benchmark for evaluating AI-generated text detection methods, specifically designed to test detection capabilities across various paraphrasing scenarios and attack vectors. The dataset includes 486,990 samples across 46 files, with a total expansion ratio of 30.0x. It covers 10 different task types (5 single-sentence and 5 sentence-pair tasks), addressing research questions such as paraphrase source attribution, general text authorship detection, AI text laundering, iterative paraphrase depth detection, and deep paraphrase attack detection.
Provided by:
JonathanZha



