"Dataset for TMED"

Name: "Dataset for TMED"
Creator: IEEE DataPort
Published: 2026-04-19 11:04:53
License: No description available

Source: DataCite Commons. Updated 2026-04-19; indexed 2026-05-03.

Download link:

https://ieee-dataport.org/documents/dataset-tmed-0

Description:

Datasets. We employ eight publicly available benchmark datasets spanning diverse domains and content types for cross-domain emerging topic rumor detection. Five datasets serve as source domains: FEVER, a large-scale fact verification dataset containing short declarative statements; GettingReal and GossipCop, which consist of full-length news articles collected from real-world news outlets; LIAR, comprising short political statements from PolitiFact; and PHEME, which contains social media posts from Twitter related to breaking news events. Three datasets are used as target domains to simulate emerging topics: CoAID, a COVID-19 healthcare misinformation dataset covering news articles and claims, notable for its highly imbalanced label distribution (over 90% non-rumor); Constraint, a COVID-19 fake news dataset collected from social media with a nearly balanced class distribution; and ANTiVax, a Twitter dataset focusing on COVID-19 vaccine misinformation. All datasets are binary-labeled as rumor or non-rumor. The datasets vary considerably in average text length (from 9.4 to 738.9 tokens), content type (statements, news articles, and social network posts), and class distribution, providing a comprehensive testbed for evaluating cross-domain generalization. Following prior work, each dataset is split into training, validation, and test sets at a 7:2:1 ratio, and 15 source–target adaptation scenarios are constructed by pairing each source dataset with each target dataset.
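The split and pairing described above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset authors' code: the function name `split_7_2_1`, the fixed seed, and the use of a plain shuffled (non-stratified) split are all assumptions, since the description does not say how the 7:2:1 split was drawn. Only the dataset names and the 5 x 3 pairing come from the description.

```python
import random
from itertools import product

# Dataset names as listed in the description.
SOURCES = ["FEVER", "GettingReal", "GossipCop", "LIAR", "PHEME"]
TARGETS = ["CoAID", "Constraint", "ANTiVax"]


def split_7_2_1(items, seed=0):
    """Shuffle items and split them into train/val/test at a 7:2:1 ratio.

    Hypothetical helper: a simple random split is assumed here; the
    original work may have used a different (e.g. stratified) procedure.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(items) * 7 // 10
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Pair every source dataset with every target dataset: 5 * 3 = 15 scenarios.
scenarios = list(product(SOURCES, TARGETS))

if __name__ == "__main__":
    train, val, test = split_7_2_1(range(100))
    print(len(train), len(val), len(test))
    print(len(scenarios), "scenarios, e.g.", scenarios[0])
```

For a heavily imbalanced target such as CoAID, a stratified split may be preferable so that the minority class appears in all three partitions; whether the original work did this is not stated.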

Provider:

IEEE DataPort

Created:

2026-04-19
