main data

Name: main data
Creator: 胡, 叶锦轩
Published: 2025-04-30 00:00:00
License: No description provided

Source: Figshare. Updated 2025-04-30; indexed 2026-04-08.

Download link:

https://figshare.com/articles/dataset/main_data/28881542/1

Resource description:

Overview

This folder contains datasets and experimental results used in a research project on rumor generation, detection, and debunking. The core data was generated by two large language models, DeepSeek-R1 and qwq-32b, with additional detection results from DeepSeek-V3. The folder includes both direct model outputs and results derived from further analyses of those outputs. The data is organized into several subfolders, each focusing on a specific aspect of the research. Details of the analysis procedures are described in the accompanying manuscript.

Folder Structure

1. deepseek-r1-debunking
This folder contains the results generated by the DeepSeek-R1 model for debunking rumors. The files include:
R_readability_results.json: Readability analysis results for the generated debunking texts.
sentiment_analysis_R.json: Sentiment analysis results for the generated debunking texts.
R_debunking_texts.json: The debunking texts generated by the model.
R_debunking_texts_with_similarity.json: The debunking texts along with their similarity scores to the official debunking texts.

2. deepseek-r1-detection
This folder contains the results of DeepSeek-R1's detection of rumors in the FakeNewsNet and Twitter1516 datasets. The files include:
DR1_detection_twitter1516.json: Detection results for the Twitter1516 dataset.
DR1_detection_fakenews.json: Detection results for the FakeNewsNet dataset.

3. deepseek-r1-generation
This folder contains the rumors generated on specific themes by the DeepSeek-R1 model. The themes and corresponding files include:
entertainment.json: Rumors generated on entertainment-related topics.
financial.json: Rumors generated on financial-related topics.
health.json: Rumors generated on health-related topics.
disaster-related.json: Rumors generated on disaster-related topics.

4. deepseek-v3-detection
This folder contains the rumor detection results for the FakeNewsNet and Twitter1516 datasets, generated by the updated DeepSeek-V3 model. The files include:
v3_results_fakenews.json: Detection results for the FakeNewsNet dataset.
v3_results_twitter1516.json: Detection results for the Twitter1516 dataset.

5. qwq-32b-debunking
This folder contains the results generated by the qwq-32b model for debunking rumors. The files include:
Q_debunking_texts_with_similarity.json: The debunking texts with similarity scores to the original content.
Q_sentiment_analysis.json: Sentiment analysis results for the generated debunking texts.
Q_debunking_readability_results.json: Readability analysis results for the generated debunking texts.
Q_debunking_texts.json: The debunking texts generated by the model.

6. qwq-32b-detection
This folder contains the detection results for the FakeNewsNet and Twitter1516 datasets, generated by the qwq-32b model. The files include:
Q_rumor_detection_results_fakenews.json: Detection results for the FakeNewsNet dataset.
Q_rumor_detection_results_twitter1516.json: Detection results for the Twitter1516 dataset.

7. qwq-32b-generation
This folder contains the rumors generated on specific themes by the qwq-32b model. The themes and corresponding files include:
entertainment.json: Rumors generated on entertainment-related topics.
financial.json: Rumors generated on financial-related topics.
health.json: Rumors generated on health-related topics.
disaster.json: Rumors generated on disaster-related topics.

Data Description

The following datasets were used in this research:
FakeNewsNet: A widely used dataset of news articles labeled as "fake" or "real", employed for training and evaluating rumor detection models. It is used in the detection phase of this study.
Twitter1516: A dataset of tweets labeled as either rumors or non-rumors. It is used to evaluate both rumor detection and generation models, and it provides a benchmark for the performance of detection models.

Both datasets are publicly available and were used to train, test, and evaluate the models in this study. Please refer to the original dataset publications for detailed information on their structure and labeling.
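
After downloading, it can be useful to check that every file named in the folder structure is present and parses as JSON. The following is a minimal sketch, not part of the dataset itself. It assumes the archive extracts into subfolders named exactly as listed above. The function name `inventory` is hypothetical, and because the description does not document the internal schema of the JSON files, the sketch reports only each file's top-level type and size.

```python
import json
from pathlib import Path

# Folder and file names copied from the description; the on-disk layout
# after extraction is an assumption.
EXPECTED_FILES = {
    "deepseek-r1-debunking": [
        "R_readability_results.json",
        "sentiment_analysis_R.json",
        "R_debunking_texts.json",
        "R_debunking_texts_with_similarity.json",
    ],
    "deepseek-r1-detection": [
        "DR1_detection_twitter1516.json",
        "DR1_detection_fakenews.json",
    ],
    "deepseek-r1-generation": [
        "entertainment.json",
        "financial.json",
        "health.json",
        "disaster-related.json",
    ],
    "deepseek-v3-detection": [
        "v3_results_fakenews.json",
        "v3_results_twitter1516.json",
    ],
    "qwq-32b-debunking": [
        "Q_debunking_texts_with_similarity.json",
        "Q_sentiment_analysis.json",
        "Q_debunking_readability_results.json",
        "Q_debunking_texts.json",
    ],
    "qwq-32b-detection": [
        "Q_rumor_detection_results_fakenews.json",
        "Q_rumor_detection_results_twitter1516.json",
    ],
    "qwq-32b-generation": [
        "entertainment.json",
        "financial.json",
        "health.json",
        "disaster.json",
    ],
}


def inventory(root):
    """Return {"folder/file": summary} for every expected file under root.

    The summary is "missing" when the file is absent; otherwise it names
    the top-level JSON type and item count. No schema is assumed.
    """
    root = Path(root)
    report = {}
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            key = f"{folder}/{name}"
            path = root / folder / name
            if not path.is_file():
                report[key] = "missing"
                continue
            with path.open(encoding="utf-8") as fh:
                data = json.load(fh)
            # Lists and dicts have a length; scalars count as one item.
            size = len(data) if isinstance(data, (list, dict)) else 1
            report[key] = f"{type(data).__name__} with {size} top-level item(s)"
    return report
```

Calling `inventory("main_data")`, where `main_data` is a placeholder for the local extraction path, returns one entry per expected file. Inspect the first records of each file before relying on any particular field names.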

Provider:

胡, 叶锦轩

Created:

2025-04-28
