AlexanderHolmes0/true-fake-news

Name: AlexanderHolmes0/true-fake-news
Creator: AlexanderHolmes0
Published: 2024-04-12 13:44:07
License: MIT

Hugging Face · Updated 2024-04-12 · Indexed 2024-06-22

Download link:

https://hf-mirror.com/datasets/AlexanderHolmes0/true-fake-news

Resource overview:

---
language:
- en
license: mit
size_categories:
- 10K<n<100K
task_categories:
- text-classification
- question-answering
- text-generation
dataset_info:
  features:
  - name: label
    dtype:
      class_label:
        names:
          '0': 'true'
          '1': fake
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 82978144
    num_examples: 33672
  - name: test
    num_bytes: 28512596
    num_examples: 11224
  download_size: 67949019
  dataset_size: 111490740
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
tags:
- news
---

# True-Fake-News

These are news articles collected from various sources, with curated labels for `true` or `fake` classification.

### Dataset Description

The dataset contains two types of articles: fake news and real news. It was collected from real-world sources. The truthful articles were obtained by crawling Reuters.com (a news website). The fake news articles were collected from unreliable websites flagged by Politifact (a fact-checking organization in the USA) and Wikipedia. The articles cover a range of topics, but the majority focus on political and world news.

### Dataset Sources [optional]

- **Repository:** [Kaggle Repo](https://www.kaggle.com/datasets/emineyetm/fake-news-detection-datasets/data)

## Uses

Text classification or question answering would be ways to use this dataset.

## Dataset Structure

| Classification | Total Number of Articles | Article Type | Article Count |
|----------------|--------------------------|--------------|---------------|
| Real-News      | 21,417                   | World        | 10,145        |
|                |                          | Political    | 11,272        |
| Fake-News      | 23,481                   | Government   | 1,570         |
|                |                          | Middle East  | 778           |
|                |                          | US           | 783           |
|                |                          | Left-Leaning | 4,459         |
|                |                          | Political    | 6,841         |
|                |                          | General      | 9,050         |

Provided by:

AlexanderHolmes0

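Summary of original information

Dataset overview

Basic information

Language: English
License: MIT
Size category: 10K<n<100K
Task categories:
- Text classification
- Question answering
- Text generation

Dataset details

Features:
- label: class label with two classes, true and fake
- text: the text content, of type string
Splits:
- train: training set, 33,672 examples, 82,978,144 bytes
- test: test set, 11,224 examples, 28,512,596 bytes
Download size: 67,949,019 bytes
Dataset size: 111,490,740 bytes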
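
As a quick sanity check on the figures above, the following Python sketch decodes the integer labels and verifies that the split sizes are mutually consistent. The names `LABEL_NAMES`, `SPLITS`, and `decode_label` are illustrative and not part of the dataset; the numbers are copied from the card metadata.

```python
# Illustrative constants copied from the dataset card metadata.
LABEL_NAMES = {0: "true", 1: "fake"}

SPLITS = {
    "train": {"num_examples": 33672, "num_bytes": 82978144},
    "test": {"num_examples": 11224, "num_bytes": 28512596},
}
DATASET_SIZE_BYTES = 111490740


def decode_label(label_id: int) -> str:
    """Map an integer class id to its class name."""
    return LABEL_NAMES[label_id]


# Totals across both splits.
total_examples = sum(s["num_examples"] for s in SPLITS.values())
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())

# Fraction of all examples that fall in the training split.
train_fraction = SPLITS["train"]["num_examples"] / total_examples

print(decode_label(1))                    # fake
print(total_examples)                     # 44896
print(total_bytes == DATASET_SIZE_BYTES)  # True
print(train_fraction)                     # 0.75
```

The split byte counts add up to the stated dataset size, and the examples are divided 75/25 between train and test.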

Configuration

Default configuration:
- Train split path: data/train-*
- Test split path: data/test-*
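
One way to load the default configuration is through the Hugging Face `datasets` library, sketched below. This assumes the Hub repository id matches the path in the download link (`AlexanderHolmes0/true-fake-news`) and that the `datasets` package is installed; `load_splits` and `describe` are hypothetical helper names introduced only for illustration.

```python
# Assumed repository id, taken from the path in the download link.
REPO_ID = "AlexanderHolmes0/true-fake-news"


def load_splits(repo_id: str = REPO_ID):
    """Load the default config and return (train, test).

    Requires `pip install datasets` and network access.
    """
    # Imported lazily so this module can be read without the dependency.
    from datasets import load_dataset

    # The default config maps data/train-* and data/test-* to two splits.
    dataset = load_dataset(repo_id)
    return dataset["train"], dataset["test"]


def describe(example: dict) -> str:
    """Format one row, assuming the `label` and `text` fields from the card."""
    name = "true" if example["label"] == 0 else "fake"
    return f"[{name}] {example['text'][:80]}"
```

Calling `load_splits()` downloads the data, so it is left uncalled here. `describe` works on any row shaped like the card's features, for example `describe({"label": 1, "text": "Some headline"})`.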

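Dataset description

The dataset contains news articles from various sources, labeled as either true or fake. The real news articles come from Reuters.com, while the fake news articles come from websites flagged as unreliable by Politifact and Wikipedia. The dataset mainly covers political and world news topics.

Use cases

The dataset is suited to text classification and question answering tasks.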

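Dataset structure

Classification	Total articles	Article type	Article count
Real news	21,417	World	10,145
		Political	11,272
Fake news	23,481	Government	1,570
		Middle East	778
		US	783
		Left-Leaning	4,459
		Political	6,841
		General	9,050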
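
The per-topic counts in the table can be checked against the stated class totals with a short Python sketch. The dictionary names are illustrative; the figures are copied from the table.

```python
# Per-topic article counts, copied from the table.
ARTICLE_COUNTS = {
    "Real-News": {"World": 10145, "Political": 11272},
    "Fake-News": {
        "Government": 1570,
        "Middle East": 778,
        "US": 783,
        "Left-Leaning": 4459,
        "Political": 6841,
        "General": 9050,
    },
}
# Class totals as stated in the table's second column.
STATED_TOTALS = {"Real-News": 21417, "Fake-News": 23481}

# Sum the topic counts per class and overall.
class_totals = {name: sum(topics.values()) for name, topics in ARTICLE_COUNTS.items()}
grand_total = sum(class_totals.values())

for name, total in class_totals.items():
    share = total / grand_total
    print(f"{name}: {total} articles ({share:.1%}), matches stated total: "
          f"{total == STATED_TOTALS[name]}")
print(f"All articles: {grand_total}")
```

Both class totals match their topic breakdowns. The table sums to 44,898 articles, two more than the 44,896 examples in the train and test splits combined; the source does not explain the difference.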
