waashk/20ng
Hugging Face | Updated: 2025-04-02 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/waashk/20ng
Description:
The 20NewsGroup dataset is a text classification dataset containing news article texts and their corresponding encoded labels. It is used for a thorough benchmark of automatic text classification, covering methods that range from traditional approaches to large language models. The dataset is released under the MIT license and falls in the 10K to 100K entries size category. It provides cross-validation train-test splits and includes the corresponding data files in pandas DataFrame format.
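As a rough illustration of what a cross-validation train-test split looks like, the sketch below partitions row positions into k contiguous folds using only the Python standard library. The helper name `kfold_indices`, the fold count of 5, and the toy rows are assumptions made for this example; the listing does not state how the dataset's own split files are named or laid out, so this is not a description of them.

```python
def kfold_indices(n_items, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous k-fold cross-validation."""
    if not 2 <= n_folds <= n_items:
        raise ValueError("n_folds must be between 2 and n_items")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_items, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_items))
        yield train_idx, test_idx
        start += size


# Toy stand-in for (text, encoded label) rows; hypothetical, not taken from the dataset.
rows = [("doc %d" % i, i % 3) for i in range(10)]
splits = list(kfold_indices(len(rows), 5))

for fold, (train_idx, test_idx) in enumerate(splits):
    train_rows = [rows[i] for i in train_idx]
    test_rows = [rows[i] for i in test_idx]
    print(fold, len(train_rows), len(test_rows))
```

Each row lands in exactly one test fold, so every document is evaluated once across the folds. If the dataset ships precomputed splits, those should be used in place of a helper like this so that results stay comparable with the benchmark.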
Provider:
waashk



