
xinjiehu76/cis5190-25f-projectb-8k5-combined

Source: Hugging Face. Updated 2025-12-13. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/xinjiehu76/cis5190-25f-projectb-8k5-combined
Description:
This dataset was created for Project B of CIS 5190 (Applied Machine Learning), Fall 2025, at the University of Pennsylvania. It supports a news source classification task and consists of 8,500 samples, each a single URL of a news article published by a specific outlet (e.g., Fox News, NBC News). The data is provided as a single Excel file whose main column is url. The classification label (the news source) is derived from the domain name of each URL. The dataset is intended for educational and research use only, specifically for training and evaluating news source classification models in CIS 5190. Limitations: it contains only URL-level information, with no full article text, and may reflect biases present in the original news sources.
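
Since the label is said to come from the domain name of each URL, a minimal sketch of that derivation may help. Everything named here is an assumption for illustration: the function source_from_url, the DOMAIN_TO_SOURCE mapping, the two domains, and the label strings are hypothetical and are not taken from the dataset or the course materials.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical domain-to-label mapping. The description names Fox News and
# NBC News only as examples; these domains and label strings are assumed.
DOMAIN_TO_SOURCE = {
    "foxnews.com": "FoxNews",
    "nbcnews.com": "NBC",
}


def source_from_url(url: str) -> Optional[str]:
    """Return the assumed news-source label for a URL, or None if unknown."""
    # hostname is None when the URL has no scheme or network location.
    host = (urlparse(url).hostname or "").lower()
    for domain, source in DOMAIN_TO_SOURCE.items():
        # Match the bare domain or any subdomain such as "www.".
        if host == domain or host.endswith("." + domain):
            return source
    return None
```

Applied to the url column after loading the Excel file (for instance with pandas.read_excel), a function like this would yield one label per row. URLs whose domain is not in the mapping come back as None, so they can be inspected rather than silently mislabeled.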
Provider:
xinjiehu76