BEIR
Source: arXiv; indexed 2025-09-30
Download link:
https://github.com/ukplab/beir
Resource description:
This dataset contains 979 recent news articles from a variety of media outlets, sorted into six categories, with a focus on articles published after July 19, 2023. For the mask prediction task, we also created a subset of 130 news articles. The full dataset is used for the news classification task. In total, 979 articles are covered across the two tasks, mask prediction and news classification.



