AG’s Corpus
Source: arXiv, indexed 2025-09-30
Download link:
http://groups.di.unipi.it/~gulli/AG_corpus_of_news_articles
Description:
AG’s Corpus contains more than 1 million news articles. The version used in the associated paper includes more than 120,000 labeled documents organized into four categories. The paper uses it for training and evaluation on a document retrieval task centered on predefined topics.
Provider:
groups.di.unipi.it



