JonathanZha/PADBen-Task1

Name: JonathanZha/PADBen-Task1
Creator: JonathanZha
Published: 2025-10-12 08:22:50
License: not specified

Hugging Face, updated 2025-10-12, indexed 2025-10-25

Download link:

https://hf-mirror.com/datasets/JonathanZha/PADBen-Task1

Description:

PADBen Task 1 is a binary classification dataset for distinguishing between human-authored and LLM-generated paraphrases. It contains 16,233 sentences, split 80% for training and 20% for testing, plus an unlabeled test subset. Each sample consists of one sentence and one binary label: 0 for human-authored text and 1 for machine-generated text. The two classes are sampled in a balanced 50-50 ratio. The README provides usage instructions and evaluation metrics.
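As a rough illustration of the sample format and of how predictions on such data might be scored, here is a minimal sketch. The field names `sentence` and `label`, the example sentences, and the predictions are all hypothetical, and the description does not say which metrics the README uses; accuracy, precision, recall, and F1 for the machine-generated class are assumed here only because they are common choices for balanced binary classification.

```python
from collections import Counter

# Hypothetical samples in the described format: one sentence, one binary label
# (0 = human-authored, 1 = machine-generated). Field names are assumptions.
samples = [
    {"sentence": "The meeting was moved to Friday.", "label": 0},
    {"sentence": "The meeting has been rescheduled for Friday.", "label": 1},
    {"sentence": "She finished the report early.", "label": 0},
    {"sentence": "She completed the report ahead of schedule.", "label": 1},
]


def label_balance(rows):
    """Return the fraction of rows carrying each label."""
    counts = Counter(row["label"] for row in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


balance = label_balance(samples)
y_true = [row["label"] for row in samples]
y_pred = [0, 1, 1, 1]  # made-up classifier output, for illustration only
metrics = binary_metrics(y_true, y_pred)
print(balance)
print(metrics)
```

With a 50-50 class balance, accuracy is not skewed by a majority class, so it can reasonably be read alongside F1.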

Provider:

JonathanZha