
AG's news corpus (AGNEWS)

Indexed by NIAID Data Ecosystem on 2026-03-14
Download link:
https://zenodo.org/record/5256677
Description:
AG’s news corpus (AGNEWS): The AG’s corpus of news articles was collected from the web. The whole corpus contains 496,835 categorized news articles from more than 2,000 news sources. The four largest classes (World, Sports, Business, and Sci/Tech) were chosen to construct the dataset used in the experiments, using only the title and description fields.
Files:
texts.txt: Document set (text), one document per line.
score.txt: Document classes, where each entry's index corresponds to the document at the same index in texts.txt.
split_.pkl: pandas DataFrame with the k-fold cross-validation partition.
The .zip archive contains all of the above files plus the TF-IDF representation in CSR matrix format.
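The file layout described above suggests a simple way to pair each document with its class. The following is a minimal sketch, not an official loader: it assumes that score.txt holds one label per line in the same order as texts.txt, and that both files are UTF-8 encoded. The function name `load_agnews` and its parameters are hypothetical.

```python
from collections import Counter


def load_agnews(texts_path, labels_path, encoding="utf-8"):
    """Read documents and labels line by line and return (texts, labels).

    Assumption: line i of the label file is the class of line i of the
    text file. Labels are kept as strings because their exact encoding
    is not specified in the description.
    """
    with open(texts_path, encoding=encoding) as f:
        # Strip only the trailing newline so the document text is unchanged.
        texts = [line.rstrip("\n") for line in f]
    with open(labels_path, encoding=encoding) as f:
        labels = [line.strip() for line in f]

    # A length mismatch means the index-based pairing cannot be trusted.
    if len(texts) != len(labels):
        raise ValueError(
            f"{len(texts)} documents but {len(labels)} labels; "
            "the files are not aligned line by line"
        )
    return texts, labels


def class_counts(labels):
    """Count how many documents fall into each class."""
    return Counter(labels)
```

After extracting the archive, calling `load_agnews("texts.txt", "score.txt")` and passing the returned labels to `class_counts` gives a quick check of the class balance. The cross-validation partition can be read with `pandas.read_pickle`, but the column layout of that DataFrame is not documented here, so inspect it before relying on any particular structure.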
Created:
2023-01-21