Guba_jsondata
ModelScope Community | Updated 2025-08-19 | Indexed 2025-08-23
Download link:
https://modelscope.cn/datasets/Maiawen/Guba_jsondata
Description:
A cleaned version of rawdata, with garbled text, hyperlinks, escape characters, and similar noise removed; this corresponds to the original json_clean folder. The json_data folder (second version; the first version only converted raw_data to JSON) is derived from json_clean by dropping posts whose comments count is 0, so it keeps only posts that received at least one comment. Because json_data can be reproduced from json_clean with that single filter, providing json_clean alone is sufficient.
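The filter described above (keep only posts with a non-zero comments count) can be sketched roughly as follows. This is a minimal illustration, not the dataset author's script: it assumes each file in json_clean is a JSON array of post objects, and that the `comments` field holds either a numeric count or a list of comments. The function names and the folder layout are hypothetical.

```python
import json
from pathlib import Path


def comment_count(post):
    """Return the number of comments on a post (schema is an assumption)."""
    value = post.get("comments", 0)
    # Assumption: `comments` is either a count or a container of comments.
    if isinstance(value, (list, tuple, dict)):
        return len(value)
    try:
        return int(value)
    except (TypeError, ValueError):
        # Missing or unparseable counts are treated as zero comments.
        return 0


def keep_commented(posts):
    """Keep only posts that received at least one comment."""
    return [p for p in posts if comment_count(p) > 0]


def filter_folder(src_dir, dst_dir):
    """Filter every *.json file in src_dir and write the result to dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(src.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            posts = json.load(f)  # assumed to be a list of post dicts
        with (dst / path.name).open("w", encoding="utf-8") as f:
            # ensure_ascii=False keeps Chinese text readable in the output.
            json.dump(keep_commented(posts), f, ensure_ascii=False)
```

Under these assumptions, calling `filter_folder("json_clean", "json_data")` would rebuild a json_data-style folder from json_clean, which is why the cleaned folder alone is enough to work with.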
Provider:
maas
Created:
2025-07-18



