
abercowsky/autotrain-data-sexual-content-classification

Source: Hugging Face. Updated: 2023-08-14. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/abercowsky/autotrain-data-sexual-content-classification
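Resource summary:
This is an AutoTrain dataset for sexual content classification. It contains text samples and their labels; each label is 0 or 1 and indicates whether the text contains sexual content. The dataset card covers data instances, field descriptions, and dataset splits. The data instances section shows two samples, one labeled 1 and the other labeled 0. The dataset has two fields: text (the text content) and target (the classification label). It is split into a training set of 568324 samples and a validation set of 142082 samples.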
Provider:
abercowsky
Summary of original information

AutoTrain Dataset for project: sexual-content-classification

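Dataset description

This dataset has been automatically processed by AutoTrain for the project sexual-content-classification.

Languages

The BCP-47 code for the dataset's language is en.

Dataset structure

Data instances

Two samples from this dataset look as follows: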

[
  {
    "text": "Youre No Good: was covered in a College babes fucks with the neighbor charting version by which soul singer and pianist?",
    "target": 1
  },
  {
    "text": "AT first \u201cthe button\u201d seemed like an April Fools\u2019 joke.\n\nNow, 13 days later, Reddit\u2019s social experiment is still holding momentum.\n\nThe button feature was added to Reddit on April 1 and contains a timer which counts down from 60 seconds to zero.\n\nHowever, every time the button is pushed, the timer is reset.\n\nAlthough Reddit users have been speculating the reason for the experiment, no one knows its specific purpose.\n\nAdditionally, no one is aware what will happen when the countdown reaches zero because the timer is yet to fall below 29 seconds.\n\nRedditors can only use the feature if they were a member of the website before April 1 and they can only push the button once.\n\nAs of this afternoon, over 711,000 members have pushed the button.\n\nSince its inception, members have received coloured circles next to their username which indicate how long they waited to push the button.\n\nThose who don\u2019t push the button receive a grey circle, while those who give in to temptation receive circles ranging from purple all the way down to red.\n\nTo date, no one has waited past the time restrictions of yellow meaning there are no orange or red circles floating around Reddit.\n\nHowever, one can only assume interest will eventually disappear and the true purpose of the button will be revealed.\n\nThink about this: for the past 12 days someone on Earth has pressed a button every 30 sec or so. http://t.co/7ALLYeWB7i #TheButton @reddit \u2014 Zachs Mind (@ZachsMind) April 13, 2015\n\nThe fact that Ive been watching #thebutton for 11 days is starting to concern me. \u2014 ConvertToChris (@converttochris) April 12, 2015\n\nI only date people who have not pressed #TheButton \u2014 Pyro (@Pyrao) April 11, 2015",
    "target": 0
  }
]

Dataset fields

The dataset has the following fields (also called "features"):

{
  "text": "Value(dtype=string, id=None)",
  "target": "ClassLabel(names=[0, 1], id=None)"
}
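
To make the schema concrete, here is a minimal Python sketch that checks whether a record matches the two fields described above. The function name `is_valid_record` and the example records are hypothetical, introduced only for illustration; they are not part of the dataset or of AutoTrain.

```python
import json

# Allowed values for the `target` class label, per the field description.
ALLOWED_TARGETS = {0, 1}


def is_valid_record(record):
    """Return True if `record` has exactly a string `text` and a 0/1 `target`."""
    if not isinstance(record, dict) or set(record) != {"text", "target"}:
        return False
    text, target = record["text"], record["target"]
    # bool is a subclass of int in Python, so reject True/False explicitly.
    if isinstance(target, bool) or not isinstance(target, int):
        return False
    return isinstance(text, str) and target in ALLOWED_TARGETS


# Hypothetical records used only to exercise the check.
raw = '[{"text": "example sentence", "target": 0}, {"text": "another", "target": 2}]'
results = [is_valid_record(r) for r in json.loads(raw)]
print(results)  # [True, False]
```

A check like this could be run over a sample of rows before training, so that malformed labels are caught early rather than surfacing as training errors.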

Dataset splits

The dataset is split into a training set and a validation set. The split sizes are as follows:

Split name    Number of samples
train         568324
valid         142082
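
The two counts above imply roughly an 80/20 division between training and validation data. The short Python sketch below works this out from the listed sizes; the variable names are illustrative only.

```python
# Split sizes as listed in the table above.
split_sizes = {"train": 568324, "valid": 142082}

total = sum(split_sizes.values())
fractions = {name: count / total for name, count in split_sizes.items()}

print(total)  # 710406
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```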