Soft-Search
Source: arXiv | Updated: 2023-02-28 | Indexed: 2024-06-21
Download link:
https://github.com/si2-urssi/eager
Description:
The Soft-Search dataset was created by the University of Washington Information School to identify and understand software produced by research projects funded by the U.S. National Science Foundation (NSF). It contains nearly 1,000 manually annotated records of software production, plus predictions for more than 150,000 NSF-funded awards inferred by a trained predictive model. The dataset was built by using the GitHub API to search for and annotate software, and by training predictive models on the abstracts and project outcome reports of NSF awards. Its applications include predicting and analyzing software production and studying the role of research software in academic publishing.
Provided by:
University of Washington Information School
Created:
2023-02-28



