DataFinder Dataset
Source: arXiv | Updated: 2023-06-07 | Indexed: 2024-06-21
Download link:
https://github.com/viswavi/datafinder
Resource description:
The DataFinder Dataset is a dataset for scientific dataset recommendation created at Carnegie Mellon University. It contains 17,887 queries, split into a large, automatically constructed training set (17,495 queries) and a small, expert-annotated test set (392 queries). Its purpose is to help researchers find suitable datasets from a natural-language description of a research idea. The content spans multiple data sources and domains and can be used with a range of information retrieval algorithms. It mainly targets the dataset recommendation problem in machine learning and artificial intelligence, helping researchers quickly locate and select datasets that fit their research needs.
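To make the task concrete, here is a minimal sketch of dataset recommendation framed as text retrieval: candidate dataset descriptions are ranked against a natural-language query by TF-IDF cosine similarity. It is an illustration only, not the method used by the DataFinder authors, and it does not read the DataFinder files. The function names `tokenize` and `rank_datasets`, the three candidate descriptions, and the query are all assumptions made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def rank_datasets(query, datasets):
    """Return (name, score) pairs sorted by TF-IDF cosine similarity to the query.

    `datasets` maps a dataset name to a short free-text description.
    """
    docs = {name: Counter(tokenize(desc)) for name, desc in datasets.items()}
    n_docs = len(docs)

    # Document frequency: in how many descriptions each term appears.
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())

    def idf(term):
        # Smoothed inverse document frequency; defined for unseen terms too.
        return math.log((1 + n_docs) / (1 + df[term])) + 1.0

    def weigh(counts):
        return {t: c * idf(t) for t, c in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q_vec = weigh(Counter(tokenize(query)))
    scored = [(name, cosine(q_vec, weigh(counts))) for name, counts in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidates and query, written for this example only.
candidates = {
    "SQuAD": "reading comprehension question answering over Wikipedia passages",
    "ImageNet": "large scale image classification with object categories",
    "LibriSpeech": "speech recognition corpus of read English audiobooks",
}
query = "I want to train a model for question answering on text passages"

for name, score in rank_datasets(query, candidates):
    print(f"{name}: {score:.3f}")
```

In this toy setup only the question answering description shares terms with the query, so it ranks first and the other two score zero. A real system would be trained and evaluated on the 17,495 training and 392 test queries described above rather than on hand-written examples.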
Provider:
Carnegie Mellon University
Created:
2023-05-26



