RenaAls/fox-nbx-headlines-2021-2025
Hugging Face | Updated 2025-12-16 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/RenaAls/fox-nbx-headlines-2021-2025
Description:
This dataset contains news headline text from Fox News and NBC News, derived from URL slugs, for news source classification. It consists of two CSV files, one of Fox News headlines and one of NBC News headlines. Each file has three columns: url (the original full URL), title (the headline text extracted from the URL slug), and source (the news source, NBC or Fox). The URLs were scraped from the archive and index pages of Fox News and NBC News, and every headline was derived from its URL slug so that both sources are preprocessed the same way. The dataset covers 2021 to 2025 and is intended for educational use, specifically supervised NLP classification experiments.
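The slug-to-title step described above can be illustrated with a minimal sketch. This is not the dataset author's actual script: the function name `slug_to_title`, the helper `make_row`, the extension-stripping rule, and the example URL are all assumptions made for illustration. Only the three column names (url, title, source) come from the description.

```python
import re
from urllib.parse import urlparse


def slug_to_title(url: str) -> str:
    """Turn the last path segment (the slug) of a URL into plain headline text.

    Hypothetical helper: the real preprocessing rules are not documented here.
    """
    # Keep only the path, and ignore a trailing slash if there is one.
    path = urlparse(url).path.rstrip("/")
    # The slug is whatever follows the final "/" in the path.
    slug = path.rsplit("/", 1)[-1]
    # Assumption: drop a trailing file extension such as ".html".
    slug = re.sub(r"\.[a-z0-9]+$", "", slug, flags=re.IGNORECASE)
    # Hyphens separate words in a slug, so replace them with spaces.
    return slug.replace("-", " ").strip()


def make_row(url: str, source: str) -> dict:
    """Build one record with the three columns named in the description."""
    return {"url": url, "title": slug_to_title(url), "source": source}


# Made-up URL, used only to show the shape of a row.
row = make_row("https://www.example.com/politics/senate-passes-budget-bill", "Fox")
print(row["title"])
```

Because both sources pass through the same function, a classifier trained on the title column sees text with identical casing and punctuation regardless of the outlet, which is the consistency the description refers to.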
Provider:
RenaAls



