main data
Source: Figshare. Updated: 2025-04-30. Indexed: 2026-04-08.
Download link:
https://figshare.com/articles/dataset/main_data/28881542/1
Description:
Overview

This folder contains datasets and experimental results used in a research project on rumor generation, detection, and debunking. The core data was generated by two large language models, DeepSeek-R1 and qwq-32b, with additional detection results from DeepSeek-V3. The folder includes both direct model outputs and results derived from further analyses of those outputs. The data is organized into several subfolders, each focusing on a specific aspect of the research. Details of the analysis procedures are described in the accompanying manuscript.

Folder Structure

1. <b>deepseek-r1-debunking</b>
This folder contains the results generated by the <b>DeepSeek-R1</b> model for debunking rumors. The files include:
   <b>R_readability_results.json</b>: Readability analysis results for the generated debunking texts.
   <b>sentiment_analysis_R.json</b>: Sentiment analysis results for the generated debunking texts.
   <b>R_debunking_texts.json</b>: The debunking texts generated by the model.
   <b>R_debunking_texts_with_similarity.json</b>: The debunking texts along with their similarity scores to the official debunking texts.

2. <b>deepseek-r1-detection</b>
This folder contains the results of <b>DeepSeek-R1</b>'s detection of rumors in the <b>FakeNewsNet</b> and <b>Twitter1516</b> datasets. The files include:
   <b>DR1_detection_twitter1516.json</b>: Detection results for the <b>Twitter1516</b> dataset.
   <b>DR1_detection_fakenews.json</b>: Detection results for the <b>FakeNewsNet</b> dataset.

3. <b>deepseek-r1-generation</b>
This folder contains rumors generated on specific themes by the <b>DeepSeek-R1</b> model. The themes and corresponding files are:
   <b>entertainment.json</b>: Rumors generated on entertainment-related topics.
   <b>financial.json</b>: Rumors generated on financial topics.
   <b>health.json</b>: Rumors generated on health-related topics.
   <b>disaster-related.json</b>: Rumors generated on disaster-related topics.

4. <b>deepseek-v3-detection</b>
This folder contains the rumor detection results for the <b>FakeNewsNet</b> and <b>Twitter1516</b> datasets, generated by the updated <b>DeepSeek-V3</b> model. The files include:
   <b>v3_results_fakenews.json</b>: Detection results for the <b>FakeNewsNet</b> dataset.
   <b>v3_results_twitter1516.json</b>: Detection results for the <b>Twitter1516</b> dataset.

5. <b>qwq-32b-debunking</b>
This folder contains the results generated by the <b>qwq-32b</b> model for debunking rumors. The files include:
   <b>Q_debunking_texts_with_similarity.json</b>: The debunking texts with similarity scores to the original content.
   <b>Q_sentiment_analysis.json</b>: Sentiment analysis results for the generated debunking texts.
   <b>Q_debunking_readability_results.json</b>: Readability analysis results for the generated debunking texts.
   <b>Q_debunking_texts.json</b>: The debunking texts generated by the model.

6. <b>qwq-32b-detection</b>
This folder contains the detection results for the <b>FakeNewsNet</b> and <b>Twitter1516</b> datasets, generated by the <b>qwq-32b</b> model. The files include:
   <b>Q_rumor_detection_results_fakenews.json</b>: Detection results for the <b>FakeNewsNet</b> dataset.
   <b>Q_rumor_detection_results_twitter1516.json</b>: Detection results for the <b>Twitter1516</b> dataset.

7. <b>qwq-32b-generation</b>
This folder contains rumors generated on specific themes by the <b>qwq-32b</b> model. The themes and corresponding files are:
   <b>entertainment.json</b>: Rumors generated on entertainment-related topics.
   <b>financial.json</b>: Rumors generated on financial topics.
   <b>health.json</b>: Rumors generated on health-related topics.
   <b>disaster.json</b>: Rumors generated on disaster-related topics.

Data Description

The following datasets were used in this research:

<b>FakeNewsNet</b>: A widely used dataset of fake news stories, employed for training and evaluating rumor detection models. It includes news articles labeled as "fake" or "real" and is used in the detection phase of this study.

<b>Twitter1516</b>: A dataset of tweets labeled as either rumors or non-rumors. It is used to evaluate both rumor detection and generation models and provides a benchmark for the performance of detection models.

Both datasets are publicly available and were used to train, test, and evaluate the models in this study. Please refer to the original dataset publications for detailed information on their structure and labeling.
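The folder layout described above can be checked and read with a few lines of Python. The following is a minimal sketch, not part of the dataset: the folder and file names are copied from the description, while the root directory name `main_data`, the UTF-8 encoding, and the helper names `check_layout` and `load_results` are assumptions made for illustration. The internal schema of each JSON file is not documented here, so the loader simply returns whatever the file contains.

```python
import json
from pathlib import Path

# Folder -> file names, copied from the folder structure in the description.
EXPECTED_FILES = {
    "deepseek-r1-debunking": [
        "R_readability_results.json",
        "sentiment_analysis_R.json",
        "R_debunking_texts.json",
        "R_debunking_texts_with_similarity.json",
    ],
    "deepseek-r1-detection": [
        "DR1_detection_twitter1516.json",
        "DR1_detection_fakenews.json",
    ],
    "deepseek-r1-generation": [
        "entertainment.json",
        "financial.json",
        "health.json",
        "disaster-related.json",
    ],
    "deepseek-v3-detection": [
        "v3_results_fakenews.json",
        "v3_results_twitter1516.json",
    ],
    "qwq-32b-debunking": [
        "Q_debunking_texts_with_similarity.json",
        "Q_sentiment_analysis.json",
        "Q_debunking_readability_results.json",
        "Q_debunking_texts.json",
    ],
    "qwq-32b-detection": [
        "Q_rumor_detection_results_fakenews.json",
        "Q_rumor_detection_results_twitter1516.json",
    ],
    "qwq-32b-generation": [
        "entertainment.json",
        "financial.json",
        "health.json",
        "disaster.json",
    ],
}


def check_layout(root):
    """Return the 'folder/file' entries that are expected but missing under root."""
    root = Path(root)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED_FILES.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


def load_results(root, folder, name):
    """Parse one JSON file and return its content unchanged (schema is undocumented)."""
    with open(Path(root) / folder / name, encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # "main_data" is a hypothetical path to the extracted download.
    for entry in check_layout("main_data"):
        print("missing:", entry)
```

Note that the two generation folders name their disaster file differently (`disaster-related.json` for DeepSeek-R1, `disaster.json` for qwq-32b), so the manifest lists them per folder instead of assuming a shared set of file names.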
Provided by:
胡, 叶锦轩
Created:
2025-04-28



