saiteja33/BMAS

Name: saiteja33/BMAS
Creator: saiteja33
Published: 2025-09-26 09:22:58
License: No description available

Source: Hugging Face. Updated 2025-09-26; indexed 2025-10-25.

Download link:

https://hf-mirror.com/datasets/saiteja33/BMAS

Description:

E-BMAS is an English-language dataset that supports four tasks: binary classification, which distinguishes human-written from machine-generated text; multiclass classification, which identifies machine-generated text and also attempts to determine which generator produced it; adversarial attack analysis, which studies common techniques used to make machine-generated text harder to detect; and sentence-level segmentation, which predicts the boundaries between human-written and machine-generated text. The dataset covers text from several domains, including Reddit, news, Wikipedia, arXiv, and Q&A, along with text generated by different models such as Deepseek, OpenAI, Anthropic, and Llama. It also includes adversarially attacked data intended to improve the robustness of detection models.
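
To make the sentence-level segmentation task concrete, here is a minimal sketch of what "predicting the boundaries" can mean. It assumes each sentence carries a label of "human" or "machine"; that label format and the function name `find_boundaries` are illustrative assumptions and are not taken from the dataset's actual schema.

```python
from typing import List


def find_boundaries(sentence_labels: List[str]) -> List[int]:
    """Return the index of every sentence that starts a new segment.

    A boundary is reported at position i + 1 whenever sentence i and
    sentence i + 1 carry different labels. The "human"/"machine" label
    values are an assumption made for this illustration only.
    """
    return [
        i + 1
        for i in range(len(sentence_labels) - 1)
        if sentence_labels[i] != sentence_labels[i + 1]
    ]


# Hypothetical document: two human sentences, two machine sentences,
# then one more human sentence.
labels = ["human", "human", "machine", "machine", "human"]
print(find_boundaries(labels))
```

A segmentation model would be evaluated on how closely its predicted boundary positions match ones derived this way from the reference labels.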

Provided by:

saiteja33
