VISTA

Name: VISTA
Creator: maas
Published: 2025-10-13 16:43:24
License: No description available

ModelScope community; updated 2025-10-13; indexed 2025-10-04

Download link:

https://modelscope.cn/datasets/iostream11/VISTA

Resource overview:

This dataset consists of 160 structurally identical JSON files; each file is an array representing a complete social thread (a main post plus its comments and optional sub-comments). The main post object typically includes `mid`, `text`, `user_link`, `date` (with timezone), `reply_count`, `type`, `emotion`, `reason`, and `replies`. Each entry in `replies` is a comment object with fields such as `comment_id`, `comment_text`, `created_at` (MM-DD), `source`, `user_id`, `user_name`, `like_count`, `reply_id`, `reply_text`, `type`, `emotion`, and `reason`, and may itself contain a `replies` array, forming a multi-level comment tree. All text is UTF-8 and may include emojis, hashtags, mentions, and mixed Chinese/English; some annotation fields can be empty, and platform counters like `reply_count` may slightly differ from the number of captured comments.
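The structure described above can be explored with a short script. What follows is a minimal sketch, not an official loader: it assumes that each element of a file's top-level array is a post object whose `replies` field (when present) holds the nested comments, and the helper names `load_thread`, `iter_comments`, and `summarize_thread` are hypothetical. The `SAMPLE` record is made up for illustration and is not taken from the dataset.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, List, Tuple

Node = Dict[str, Any]


def load_thread(path: str) -> List[Node]:
    """Parse one thread file (a UTF-8 JSON array) into Python objects."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def iter_comments(node: Node, depth: int = 1) -> Iterator[Tuple[int, Node]]:
    """Yield (depth, comment) for every comment below `node`, depth-first.

    A missing, null, or empty `replies` field is treated as "no children".
    """
    for reply in node.get("replies") or []:
        yield depth, reply
        yield from iter_comments(reply, depth + 1)


def summarize_thread(thread: List[Node]) -> List[Dict[str, Any]]:
    """Compare the platform's `reply_count` with the comments actually captured."""
    summaries = []
    for post in thread:
        comments = list(iter_comments(post))
        summaries.append({
            "mid": post.get("mid"),
            "reply_count": post.get("reply_count"),  # platform counter, may differ
            "captured": len(comments),               # comments found in the tree
            "max_depth": max((d for d, _ in comments), default=0),
        })
    return summaries


# Illustrative record only (invented values, same field names as the description).
SAMPLE: List[Node] = json.loads("""
[{"mid": "1", "text": "main post", "reply_count": 4, "emotion": "", "replies": [
    {"comment_id": "c1", "comment_text": "first", "replies": [
        {"comment_id": "c2", "comment_text": "nested reply"}]},
    {"comment_id": "c3", "comment_text": "second"}
]}]
""")

if __name__ == "__main__":
    for row in summarize_thread(SAMPLE):
        print(row)
```

For a real file, `summarize_thread(load_thread("some_thread.json"))` would produce the same kind of summary (the file name here is a placeholder). Counting captured comments separately from `reply_count` reflects the note above that the two can differ.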

Provider:

maas

Created:

2025-09-27

Collection summary

Dataset introduction