20 Newsgroups (20NG)
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/5256620
Description:
20 Newsgroups (20NG) is a classic, widely used dataset for experiments in text applications of machine learning techniques. It contains 18,846 newsgroup documents, partitioned almost evenly across 20 newsgroup categories.
http://qwone.com/~jason/20Newsgroups/
The files:
texts.txt: Document set (text). One document per line.
score.txt: Class label of each document, aligned by line index with texts.txt.
split_.pkl: pandas DataFrame with the k-fold cross-validation partition.
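A minimal sketch of reading the two text files into aligned Python lists. The function name load_20ng is hypothetical, and the sketch assumes UTF-8 encoding and one integer label per line in score.txt; neither is confirmed by the description above.

```python
def load_20ng(texts_path, score_path, encoding="utf-8"):
    """Read line-aligned documents and labels (hypothetical helper).

    Assumptions, not confirmed by the dataset description:
    - both files are UTF-8 encoded;
    - score.txt holds one integer class index per line.
    """
    # One document per line; drop only the trailing newline.
    with open(texts_path, encoding=encoding) as f:
        texts = [line.rstrip("\n") for line in f]

    # One class index per line; skip blank lines such as a trailing one.
    with open(score_path, encoding=encoding) as f:
        labels = [int(line) for line in f if line.strip()]

    # The two files are described as index-aligned, so lengths must match.
    if len(texts) != len(labels):
        raise ValueError(
            f"{len(texts)} documents but {len(labels)} labels; files are not aligned"
        )
    return texts, labels
```

Calling load_20ng("texts.txt", "score.txt") would return the documents and their labels in the same order. The split file can presumably be opened with pandas.read_pickle, but its column layout is not documented here, so it is left out of the sketch.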
Label definitions (values in score.txt):
0 atheist resources
1 computer graphics
2 computer os ms windows misc
3 computer system ibm pc hardware
4 computer system mac hardware
5 computer windows x
6 misc miscellaneous for sale
7 rec autos
8 rec motorcycles
9 rec sport baseball
10 rec sport hockey
11 science crypt
12 science electronics
13 science med
14 science space
15 society religion christian
16 talk politics guns
17 talk politics mideast
18 talk politics misc miscellaneous
19 talk religion misc miscellaneous
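The label table above can be sketched as a lookup list, which makes it easy to turn integer labels into readable names or to check how evenly the classes are populated. The names LABEL_NAMES, label_name, and class_counts are hypothetical helpers, not part of the dataset.

```python
from collections import Counter

# Position in the list is the class index used in the score file.
LABEL_NAMES = [
    "atheist resources",
    "computer graphics",
    "computer os ms windows misc",
    "computer system ibm pc hardware",
    "computer system mac hardware",
    "computer windows x",
    "misc miscellaneous for sale",
    "rec autos",
    "rec motorcycles",
    "rec sport baseball",
    "rec sport hockey",
    "science crypt",
    "science electronics",
    "science med",
    "science space",
    "society religion christian",
    "talk politics guns",
    "talk politics mideast",
    "talk politics misc miscellaneous",
    "talk religion misc miscellaneous",
]


def label_name(index):
    """Return the category name for an integer class index (0 to 19)."""
    if not 0 <= index < len(LABEL_NAMES):
        raise ValueError(f"unknown class index: {index}")
    return LABEL_NAMES[index]


def class_counts(labels):
    """Count documents per category name; absent categories get 0."""
    counts = Counter(labels)
    return {name: counts.get(i, 0) for i, name in enumerate(LABEL_NAMES)}
```

For example, label_name(14) gives "science space", and class_counts applied to a list of integer labels gives a per-category document count.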
Created:
2023-01-20



