Dissemination Event and Major Dissemination Risk Event Repository

Indexed by the National Basic Science Data Center on 2026-01-30
Download link:
https://nbsdc.cn/general/dataDetail?id=697a32b6195d261c3361cd61&type=1
Resource description:
The Dissemination Event and Major Dissemination Risk Event Repository is a multilingual, cross-regional database of dissemination events and risk events with global coverage. It aims to build a large-scale, multi-platform, multi-feature collection of event cases that supports applications such as dissemination analysis, risk identification, and public opinion monitoring, and that provides data support and indicator requirements for theories, models, and knowledge bases concerning the dissemination of heterogeneous false and harmful information. There are two main data types: dissemination events published on social media platforms, and risk events published on those platforms.

The database was built through cross-platform crawling, event argument extraction, event coreference resolution, and risk event identification. It contains 327,386 dissemination event cases, 31,898 risk event cases, and 5,012,384,051 data entries in total. It covers 8 platforms (Weibo, WeChat, the China Internet Joint Rumor-Busting Platform, Twitter, People's Daily Online, People's Daily, Baidu, and Tencent News) and 3 languages (Chinese, English, and Spanish). Each event carries 18 features: event content, release time, release location, occurrence region, release link, included modalities, event type, publishing user, publishing tool, language, publishing platform, number of likes, number of comments, number of reposts, whether it is a risk event, event index, predecessor event index, and number of data entries contained in the event. The data is stored in JSONL format and totals approximately 810 MB.
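Because the data is stored as JSONL with an event index, a predecessor event index, and a risk flag among its features, a reader might stream the records and follow predecessor links roughly as sketched below. The listing does not give the actual key names, so `event_index`, `predecessor_event_index`, and `is_risk_event` are hypothetical placeholders, and the three sample records are made up for illustration.

```python
import io
import json
from typing import Iterator

# Hypothetical field names; the real keys in the JSONL files may differ.
RISK_KEY = "is_risk_event"
INDEX_KEY = "event_index"
PRED_KEY = "predecessor_event_index"


def iter_events(lines) -> Iterator[dict]:
    """Yield one event dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def predecessor_chain(events_by_index: dict, start) -> list:
    """Follow predecessor links from `start` back to the earliest known event."""
    chain, seen = [], set()
    current = start
    # Stop on a missing link, an unknown index, or a cycle.
    while current is not None and current in events_by_index and current not in seen:
        seen.add(current)
        chain.append(current)
        current = events_by_index[current].get(PRED_KEY)
    return chain


# Three made-up records standing in for a real file.
sample = io.StringIO(
    '{"event_index": 1, "predecessor_event_index": null, "is_risk_event": false}\n'
    '{"event_index": 2, "predecessor_event_index": 1, "is_risk_event": true}\n'
    '{"event_index": 3, "predecessor_event_index": 2, "is_risk_event": true}\n'
)

events_by_index = {e[INDEX_KEY]: e for e in iter_events(sample)}
risk_indices = [i for i, e in events_by_index.items() if e.get(RISK_KEY)]
chain = predecessor_chain(events_by_index, 3)
print(risk_indices)
print(chain)
```

For a real download, `iter_events` could be given an open file handle, for example `open(path, encoding="utf-8")`, in place of the in-memory sample, so that records are parsed one line at a time.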
Provider:
Beijing Institute of Technology
Collected summary
Dataset introduction
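Background and challenges
Background overview
This dataset is a global, multilingual, cross-regional database of dissemination events and risk events, intended to support applications such as dissemination analysis, risk identification, and public opinion monitoring. It contains more than 320,000 dissemination events and more than 30,000 risk events, drawn from 8 platforms including Weibo, WeChat, and Twitter, in Chinese, English, and Spanish. Each event has 18 features, and the data is stored in JSONL format with a total size of about 810 MB.
The summary above was collected and generated by 遇见数据集.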