five

all-subjects

arXiv, indexed 2025-09-30
Download link:
https://github.com/jd-coderepos/llms4subjects/tree/main/shared-task-datasets/TIBKAT/all-subjects
Resource description:
This dataset contains the complete open-source TIBKAT collection, with predefined training and development splits. Participants also received the accompanying GND subject taxonomy, which contains 204,739 subjects. The training split has 81,937 records and the development split has 13,666. The task is automatic subject tagging of scientific and technical records.
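
As a rough illustration of the scale described above, the following sketch works out the split proportions and the ratio of taxonomy size to training records. It uses only the three figures quoted in the description; the function name `split_summary` and its output keys are hypothetical and not part of the dataset or its tooling.

```python
# Figures quoted in the dataset description.
TRAIN_RECORDS = 81_937
DEV_RECORDS = 13_666
GND_SUBJECTS = 204_739


def split_summary(train: int, dev: int, subjects: int) -> dict:
    """Summarize split sizes (hypothetical helper, for illustration only)."""
    total = train + dev
    return {
        "total": total,
        "train_share": round(train / total, 3),
        "dev_share": round(dev / total, 3),
        # Candidate subjects per training record: a value above 1 means
        # the label space is larger than the training set.
        "subjects_per_train_record": round(subjects / train, 2),
    }


summary = split_summary(TRAIN_RECORDS, DEV_RECORDS, GND_SUBJECTS)
print(summary)
```

The taxonomy holds more subjects than there are training records, so it may be worth treating the task as multi-label classification over a very large label space rather than as ordinary classification.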
Provider:
TIBKAT