SubMaroon/DTF_Comments_Responses_Counts
Hugging Face | Updated 2025-02-04 | Indexed 2025-02-15
Download link:
https://hf-mirror.com/datasets/SubMaroon/DTF_Comments_Responses_Counts
Description:
This dataset contains data from the website DTF.ru, covering mid-2016 through the end of 2024. Its fields include the post title, parent comment, parent comment author, child comment, child comment author, subsite name, parent comment ID, child comment ID, the ID of the parent comment that the child comment replies to, the number of likes on the parent comment, the number of likes on the child comment, the number of replies to the parent comment, a normalized value of that reply count, and toxicity metrics for the parent comment and the child comment. Comment text contains English and Russian letters, digits, and punctuation marks. Only comments with at least 30 characters and at least 5 likes are included. The dataset has not been cleaned and contains many duplicate rows, including some comments marked as deleted.
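Because the description notes that the data is uncleaned, a first pass before analysis might look like the minimal sketch below. It uses only the Python standard library and works on rows represented as dictionaries. The column names `child_comment` and `child_likes`, and the `[deleted]` marker, are hypothetical: the listing does not give the actual field names or say how deleted comments are marked, so adjust them after inspecting the real data. Applying the length and like thresholds to the child comment is likewise an assumption.

```python
# Hypothetical column names: the listing describes the fields but does not
# give their exact identifiers, so check them against the real dataset.
TEXT_COL = "child_comment"
LIKES_COL = "child_likes"


def clean_rows(rows, deleted_marker="[deleted]", min_chars=30, min_likes=5):
    """Drop exact duplicate rows, rows marked as deleted, and rows that
    fall below the documented length and like thresholds.

    `deleted_marker` is an assumed placeholder string, not a documented one.
    """
    seen = set()
    cleaned = []
    for row in rows:
        # A row counts as a duplicate only if every field matches.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)

        text = row[TEXT_COL]
        if deleted_marker in text:
            continue
        if len(text) < min_chars or row[LIKES_COL] < min_likes:
            continue
        cleaned.append(row)
    return cleaned
```

The function keeps the first occurrence of each duplicated row and preserves the input order, so it can be applied to rows streamed from any loader that yields dictionaries with hashable values.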
Provider:
SubMaroon



