madhaviit/corybooker_comments
Source: Hugging Face. Updated 2024-02-15. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/madhaviit/corybooker_comments
Resource overview:
---
dataset_info:
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: post_id
    dtype: string
  - name: comment_id
    dtype: int64
  - name: comment_url
    dtype: string
  - name: commenter_id
    dtype: int64
  - name: commenter_name
    dtype: string
  - name: comment_text
    dtype: string
  - name: comment_time
    dtype: string
  - name: comment_image
    dtype: string
  - name: comment_reactors
    dtype: string
  - name: spam
    dtype: string
  - name: hate
    dtype: string
  splits:
  - name: train
    num_bytes: 2520157
    num_examples: 5300
  download_size: 896481
  dataset_size: 2520157
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
This dataset supports analysis of social media comments. Each record holds the comment text, the commenter ID and name, the comment time and URL, and image and reactor fields, along with spam and hate labels that mark whether the comment is spam or hate speech. The dataset provides a single training split of 5,300 examples.
Provided by:
madhaviit
Summary of original information
Dataset information
Features
- Unnamed: 0: int64
- post_id: string
- comment_id: int64
- comment_url: string
- commenter_id: int64
- commenter_name: string
- comment_text: string
- comment_time: string
- comment_image: string
- comment_reactors: string
- spam: string
- hate: string
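
The feature list can be turned into a simple row check. The sketch below is illustrative only: the names EXPECTED_FEATURES and check_row are hypothetical, the mapping of int64 to Python int and string to Python str is an assumption, and so is the choice to tolerate None for empty fields.

```python
# Expected column types, copied from the feature list in dataset_info.
# int64 is mapped to Python int and string to Python str (an assumption).
EXPECTED_FEATURES = {
    "Unnamed: 0": int,
    "post_id": str,
    "comment_id": int,
    "comment_url": str,
    "commenter_id": int,
    "commenter_name": str,
    "comment_text": str,
    "comment_time": str,
    "comment_image": str,
    "comment_reactors": str,
    "spam": str,
    "hate": str,
}


def check_row(row):
    """Return a list of problems found in one row; an empty list means it matches."""
    problems = []
    for name, expected in EXPECTED_FEATURES.items():
        if name not in row:
            problems.append(f"missing column: {name}")
            continue
        value = row[name]
        # Empty fields are tolerated here (assumption, not stated by the source).
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    for name in sorted(set(row) - set(EXPECTED_FEATURES)):
        problems.append(f"unexpected column: {name}")
    return problems
```

Calling check_row on each record before training would surface schema drift early, for example a comment_id that arrives as text instead of an integer.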
Data splits
- train: 5,300 examples, 2,520,157 bytes
Dataset size
- Download size: 896,481 bytes
- Dataset size: 2,520,157 bytes
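
The size figures give a rough sense of scale. The short calculation below uses only the numbers reported in the metadata; the variable names are illustrative, and reading the download-to-dataset ratio as a compression ratio is an assumption about how the files are stored.

```python
# Figures copied from the dataset metadata.
NUM_EXAMPLES = 5300
DATASET_SIZE_BYTES = 2520157
DOWNLOAD_SIZE_BYTES = 896481

# Average in-memory size of one comment record.
bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"{bytes_per_example:.1f} bytes per example")
print(f"download is {download_ratio:.1%} of the dataset size")
```

This works out to roughly 475 bytes per example, with the download a little over a third of the dataset size.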
Configurations
- default: contains the training data files, at path data/train-*
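
A minimal loading sketch, assuming the Hugging Face `datasets` package is installed and the repository is reachable over the network. The repository ID is taken from the page title, and the helper name load_comments is hypothetical.

```python
def load_comments(split="train"):
    """Load one split of the comments dataset from the Hugging Face Hub."""
    # Imported lazily so this module can be imported without the library installed.
    from datasets import load_dataset

    # The default config is used; the metadata lists only a train split.
    return load_dataset("madhaviit/corybooker_comments", split=split)


# Example usage (requires network access):
# ds = load_comments()
# print(ds.num_rows)
# print(ds.column_names)
```

If the load succeeds, the row count and column names can be compared against the 5,300 examples and 12 features listed in the metadata.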



