
saiteja33/BMAS

Source: Hugging Face. Updated 2025-09-26; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/saiteja33/BMAS
Resource overview:
E-BMAS is an English-language dataset for detecting machine-generated text. It supports four tasks: binary classification, which distinguishes human-written from machine-generated text; multiclass classification, which identifies machine-generated text and also attempts to determine which generator produced it; adversarial-attack analysis, which studies common techniques for making machine-generated text harder to detect; and sentence-level segmentation, which predicts the boundary between human-written and machine-generated text. The dataset draws on text from several domains, including Reddit, news, Wikipedia, arXiv, and Q&A, as well as text generated by models from Deepseek, OpenAI, Anthropic, and Llama. It also includes adversarially attacked data, intended to help make detection models more robust.
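To make the relationship between the sentence-level segmentation task and the binary task concrete, here is a minimal Python sketch. It is not taken from the dataset: the function names, the 0/1 label encoding, and the assumption that a text is human-written up to a single boundary and machine-generated after it are all hypothetical, and the actual E-BMAS schema may differ.

```python
from typing import List


def sentence_labels(num_sentences: int, boundary: int) -> List[int]:
    """Label each sentence 0 (human) or 1 (machine).

    Assumption (hypothetical, not from the dataset card): sentences before
    index `boundary` are human-written and the rest are machine-generated.
    """
    if not 0 <= boundary <= num_sentences:
        raise ValueError("boundary must lie between 0 and num_sentences")
    return [0] * boundary + [1] * (num_sentences - boundary)


def binary_label(labels: List[int]) -> int:
    """Document-level label: 1 if any sentence is machine-generated."""
    return int(any(labels))


# A five-sentence text whose first three sentences are human-written.
labels = sentence_labels(num_sentences=5, boundary=3)
print(labels)                # [0, 0, 0, 1, 1]
print(binary_label(labels))  # 1
```

Under these assumptions, a boundary equal to the number of sentences corresponds to a fully human-written text, and a boundary of 0 to a fully machine-generated one.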
Provided by:
saiteja33